Using the Cow type in Rust

March 1, 2024

一颗向上的水滴

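std::borrow::Cow is an enum with two variants (in Rust the members of an enum are called variants), Borrowed and Owned. They are usually read as the two states a resource or piece of data can be in: borrowed, or owned.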

pub enum Cow<'a, B>
where
    B: 'a + ToOwned + ?Sized,
{
    Borrowed(&'a B),
    Owned(<B as ToOwned>::Owned),
}

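The two states can seem abstract at first. They are really defined by how the value is used, and the code later in this post makes them concrete.

Creating a Cow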

let st = "foobar";
let sg = String::from("foobar");

Direct construction

Each variant can be constructed directly, like any other enum:

let b_st = Cow::Borrowed(st);

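Borrowed wraps a reference, and a &str satisfies that. Here the generic parameter B is str. The definition requires B: ToOwned, which str implements (its owned form is String).

For other types, add an explicit &. Here B is String, so b_sg_p is a Cow<'_, String>: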

let b_sg_p = Cow::Borrowed(&sg);

Creating an Owned value:

let o_st: Cow<'_, &str> = Cow::Owned(st);

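Here B is &str, which differs from the Borrowed case above. The annotation Cow<'_, &str> has to be written out, because B cannot be inferred from the Owned payload alone. b_st above has type Cow<'_, str>, so the two are different types. An owned counterpart of b_st would be Cow::Owned(String::from(st)), since the owned form of str is String.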
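To make the difference between these types concrete, here is a minimal sketch that builds each value with an explicit annotation and reports which variant it holds. The helper variant_name and the binding o_sg are names made up for this illustration; they are not part of the standard library.

```rust
use std::borrow::Cow;

/// Hypothetical helper: reports which variant a `Cow` currently holds.
fn variant_name<B: ToOwned + ?Sized>(c: &Cow<'_, B>) -> &'static str {
    match c {
        Cow::Borrowed(_) => "Borrowed",
        Cow::Owned(_) => "Owned",
    }
}

fn main() {
    let st = "foobar";
    let sg = String::from("foobar");

    // B = str: the Cow borrows a string slice.
    let b_st: Cow<'_, str> = Cow::Borrowed(st);
    // B = String: the Cow borrows a whole String.
    let b_sg_p: Cow<'_, String> = Cow::Borrowed(&sg);
    // B = &str: the "owned" form of a &str is just another &str.
    let o_st: Cow<'_, &str> = Cow::Owned(st);
    // B = str: the owned form of str is String.
    let o_sg: Cow<'_, str> = Cow::Owned(sg.clone());

    println!("b_st:   {}", variant_name(&b_st));
    println!("b_sg_p: {}", variant_name(&b_sg_p));
    println!("o_st:   {}", variant_name(&o_st));
    println!("o_sg:   {}", variant_name(&o_sg));
}
```

Only b_st and o_sg share a type (Cow<'_, str>), so only those two could be passed to the same function or stored in the same collection.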

Creating with from

Different from() implementations produce Borrowed or Owned, for example:

let f_st = Cow::from(st);

uses this implementation:

fn from(s: &'a str) -> Cow<'a, str> {
    Cow::Borrowed(s)
}

So f_st has type Cow<'a, str> and holds Borrowed("foobar").

let f_sg = Cow::from(sg);

uses:

fn from(s: String) -> Cow<'a, str> {
    Cow::Owned(s)
}

So f_sg has type Cow<'a, str> and holds Owned("foobar").

let f_sg_p = Cow::from(&sg);

uses:

fn from(s: &'a String) -> Cow<'a, str> {
    Cow::Borrowed(s.as_str())
}

So f_sg_p also has type Cow<'a, str> and holds Borrowed("foobar").

Summary

For the conversions shown here, passing a reference to from() yields Borrowed, and passing an owned value yields Owned.
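The rule can be checked with a short sketch. It uses the From implementations for str and String quoted above, plus the analogous standard-library ones for slices and Vec. The helper is_borrowed is a name invented for this example.

```rust
use std::borrow::Cow;

/// Hypothetical helper: true if the `Cow` holds the Borrowed variant.
fn is_borrowed<B: ToOwned + ?Sized>(c: &Cow<'_, B>) -> bool {
    matches!(c, Cow::Borrowed(_))
}

fn main() {
    let st = "foobar";
    let sg = String::from("foobar");
    let v = vec![1, 2, 3];

    // References in, Borrowed out.
    let f_st: Cow<'_, str> = Cow::from(st);
    let f_sg_p: Cow<'_, str> = Cow::from(&sg);
    let f_slice: Cow<'_, [i32]> = Cow::from(&v[..]);
    assert!(is_borrowed(&f_st));
    assert!(is_borrowed(&f_sg_p));
    assert!(is_borrowed(&f_slice));

    // Owned values in, Owned out.
    let f_sg: Cow<'_, str> = Cow::from(sg.clone());
    let f_vec: Cow<'_, [i32]> = Cow::from(v.clone());
    assert!(!is_borrowed(&f_sg));
    assert!(!is_borrowed(&f_vec));

    println!("all conversions produced the expected variant");
}
```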

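Methods

`to_owned(&self)` and `into_owned(self)`

The definitions are as follows. The to_owned shown here is the blanket ToOwned implementation for any T: Clone, which is the one a Cow value uses: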

fn to_owned(&self) -> T {
    self.clone()
}
fn into_owned(self) -> <B as ToOwned>::Owned {
    match self {
        Borrowed(borrowed) => borrowed.to_owned(),
        Owned(owned) => owned,
    }
}

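So to_owned clones the Cow whether it is Borrowed or Owned. Note what that clone does: for Borrowed it copies only the reference, so the result is still Borrowed and points at the same data; for Owned it clones the owned data.
Many Rust types have to_owned(&self): it takes an immutable reference and returns a new value with ownership. into_owned(self), by contrast, is as far as I know found only on Cow in the standard library.
into_owned(self) takes ownership of self, so self can no longer be used after the call.
- For a Borrowed value it first calls to_owned on the inner data. Which to_owned runs depends on the borrowed type, for example:
  - if borrowed is a str, it calls fn to_owned(&self) -> String, which returns a String holding a copy of the data;
  - if borrowed is a String, it calls fn to_owned(&self) -> T, which returns a cloned String;
  - if borrowed is itself a Cow, it calls the fn to_owned(&self) -> T shown above, i.e. clone().
- For an Owned value it returns the owned data itself, with no copy (it is the value, not a reference to it).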

let b = Cow::Borrowed("foobar"); // b: Cow<'_, str>, holding Borrowed
let bto = b.to_owned(); // bto has the same type as b and is also Borrowed
let mut bio = b.into_owned(); // bio: String, because str::to_owned() returns String
assert_eq!(bio, String::from("foobar"));
bio.make_ascii_uppercase();
assert_eq!(bio, String::from("FOOBAR"));

let o: Cow<'_, &str> = Cow::Owned("foobar");
let oto = o.to_owned(); // oto has the same type as o: Cow<'_, &str>, holding Owned
let oio = o.into_owned(); // oio: &str. Owned is returned as-is without to_owned, so it stays &str rather than String
assert_eq!(oio, "foobar");
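Whether a copy was actually made can be observed by comparing buffer addresses. The sketch below does that for both variants; same_buffer is a hypothetical helper written for this post, and the pointer comparison is only a way of making the copies visible, not something normal code needs.

```rust
use std::borrow::Cow;

/// Hypothetical helper: true when both strings start at the same address,
/// i.e. no new buffer was allocated between them.
fn same_buffer(a: &str, b: &str) -> bool {
    a.as_ptr() == b.as_ptr()
}

fn main() {
    let source = String::from("foobar");

    // Borrowed: to_owned clones the Cow, which copies only the reference.
    let b: Cow<'_, str> = Cow::Borrowed(source.as_str());
    let bto = b.to_owned();
    assert!(matches!(bto, Cow::Borrowed(_)));
    assert!(same_buffer(&bto, &source));

    // Borrowed: into_owned copies the data into a new String.
    let bio: String = b.into_owned();
    assert!(!same_buffer(&bio, &source));

    // Owned: to_owned clones the data into a new buffer.
    let o: Cow<'_, str> = Cow::Owned(String::from("foobar"));
    let oto = o.to_owned();
    assert!(matches!(oto, Cow::Owned(_)));
    assert!(!same_buffer(&oto, &o));

    // Owned: into_owned moves the existing String out without copying.
    let o_ptr = o.as_ptr();
    let oio: String = o.into_owned();
    assert_eq!(oio.as_ptr(), o_ptr);

    println!("copies happened only where expected");
}
```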

`to_mut(&mut self)`

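to_mut() returns a mutable reference to the data without overstepping its rights: on Borrowed it first clones the data and turns self into an Owned holding the copy. Writes through the returned reference always change the Cow itself. If the Cow was Borrowed, the data it originally borrowed from is untouched (a clone was made); if it was Owned, the data it already held is modified in place.

The definition and implementation are straightforward: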

pub fn to_mut(&mut self) -> &mut <B as ToOwned>::Owned {
    match *self {
        Borrowed(borrowed) => {
            *self = Owned(borrowed.to_owned());
            match *self {
                Borrowed(..) => unreachable!(),
                Owned(ref mut owned) => owned,
            }
        }
        Owned(ref mut owned) => owned,
    }
}

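This function takes a mutable borrow; it does not transfer ownership.
For an Owned value it returns a mutable reference to the data it already holds.
For a Borrowed value it first replaces self with an Owned holding the newly cloned data, then returns a mutable reference to that data.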

let sg = String::from("foobar");
let mut mb = Cow::Borrowed(&sg);
let mbtm = mb.to_mut();
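// After to_mut():
// mb (self) now holds the cloned data and has become Owned,
// so it is decoupled from sg
// mbtm has type &mut String
// mbtm is a mutable reference to the data inside mb
mbtm.make_ascii_uppercase();
// mbtm mutably borrows mb, so mb can be used again only after the last use of mbtm
println!("{}", mbtm); // last use of mbtm; the mutable borrow ends here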
println!("{}", mb);
// output:
// FOOBAR
// FOOBAR

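As the output shows, mb and mbtm both read the uppercase FOOBAR. An Owned value behaves the same way: the change made through the reference shows up in the Cow.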
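A small sketch can confirm both halves of that claim: the source of a Borrowed value stays unchanged, and an Owned value is modified in place without a new allocation. The function shout is a name invented for this example.

```rust
use std::borrow::Cow;

/// Hypothetical helper: upper-cases the text in place through `to_mut()`.
fn shout(text: &mut Cow<'_, str>) {
    text.to_mut().make_ascii_uppercase();
}

fn main() {
    let sg = String::from("foobar");

    // Borrowed: to_mut() clones first, so sg is left alone.
    let mut borrowed: Cow<'_, str> = Cow::Borrowed(sg.as_str());
    shout(&mut borrowed);
    assert!(matches!(borrowed, Cow::Owned(_)));
    assert_eq!(borrowed, "FOOBAR");
    assert_eq!(sg, "foobar");

    // Owned: to_mut() hands out the existing buffer, no clone.
    let mut owned: Cow<'_, str> = Cow::Owned(String::from("foobar"));
    let before = owned.as_ptr();
    shout(&mut owned);
    assert_eq!(owned.as_ptr(), before);
    assert_eq!(owned, "FOOBAR");

    println!("{} {} {}", sg, borrowed, owned);
}
```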

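Summary

| Method | Receiver | Returns | How it works | When to use | Notes |
| --- | --- | --- | --- | --- | --- |
| `to_owned(&self)` | immutable borrow | a new Cow of the same type | clones the Cow through an immutable borrow and leaves the original untouched; Owned data is copied, while the clone of a Borrowed still points at the same borrowed data | you need a second Cow that can go its own way while the original stays usable | Not specific to Cow; many types follow the same pattern. |
| `into_owned(self)` | takes ownership | the inner data, with ownership (not a Cow) | Borrowed calls to_owned (clone) on the inner data; Owned returns its data directly | you want to take the data out of the Cow and keep it for good | Specific to Cow; read "into" as stripping off the Cow wrapper and, if needed, calling to_owned on the inner data. |
| `to_mut(&mut self)` | mutable borrow | a mutable reference to the inner data (not a Cow) | Borrowed calls to_owned (clone) on the inner data and stores the result as Owned; unlike `into_owned()`, this changes self | you want to modify the data and keep using the Cow afterwards | Specific to Cow; other types more often offer `as_mut(&mut self)` or `into_mut(self)`. |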

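Calling into_owned(self) on a Borrowed value can seem unfair: it copies the borrowed data and also consumes the Cow. Only the Cow wrapper is gone, though; the data that was borrowed from remains valid.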

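Applications

Copy-on-write

The official documentation gives two examples, both showing how to_mut() lets a Cow clone data only when needed in read-heavy, write-light workloads. The first one: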

fn abs_all(input: &mut Cow<'_, [i32]>) {
    for i in 0..input.len() {
        let v = input[i];
        if v < 0 {
            // to_mut(): clones on Borrowed, does not clone on Owned
            input.to_mut()[i] = -v;
        }
    }
}

let slice = [0, 1, 2];
let mut input = Cow::from(&slice[..]); // input: Cow<'_, [i32]>, holding Borrowed
abs_all(&mut input); // to_mut() is never called, so input still borrows slice

let slice = [-1, 0, 1];
let mut input = Cow::from(&slice[..]); // input: Cow<'_, [i32]>, holding Borrowed
abs_all(&mut input); // to_mut() is called, so input now owns a copy separate from slice

let mut input = Cow::from(vec![-1, 0, 1]); // input holds Owned
abs_all(&mut input); // to_mut() is called, but Owned is modified in place without cloning

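The example takes the absolute value of every element of an array or slice. The goal is this: assuming arrays that contain negative numbers are the minority (two of the three arrays in the example contain one, which gives the opposite impression), abs_all copies nothing for the majority without negatives, and copies and modifies only the few that have them.

The net effect: data that is owned is modified in place, and data that is only borrowed is copied on the first write.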
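The three cases can be turned into a runnable check. The sketch below reuses abs_all from the example and asserts on the variant and on the original slice after each call. The helper is_borrowed is an addition for this post, not part of the official example.

```rust
use std::borrow::Cow;

fn abs_all(input: &mut Cow<'_, [i32]>) {
    for i in 0..input.len() {
        let v = input[i];
        if v < 0 {
            // Clones only if the Cow is still Borrowed.
            input.to_mut()[i] = -v;
        }
    }
}

/// Hypothetical helper: true if no copy has been made yet.
fn is_borrowed(input: &Cow<'_, [i32]>) -> bool {
    matches!(input, Cow::Borrowed(_))
}

fn main() {
    // No negatives: nothing is written, so nothing is cloned.
    let slice = [0, 1, 2];
    let mut input = Cow::from(&slice[..]);
    abs_all(&mut input);
    assert!(is_borrowed(&input));

    // One negative: the first write clones, the source stays as it was.
    let slice = [-1, 0, 1];
    let mut input = Cow::from(&slice[..]);
    abs_all(&mut input);
    assert!(!is_borrowed(&input));
    assert_eq!(slice, [-1, 0, 1]);
    assert_eq!(&input[..], &[1, 0, 1][..]);

    // Already owned: modified in place, same buffer before and after.
    let mut input = Cow::from(vec![-1, 0, 1]);
    let before = input.as_ptr();
    abs_all(&mut input);
    assert_eq!(input.as_ptr(), before);
    assert_eq!(&input[..], &[1, 0, 1][..]);

    println!("copy-on-write behaved as described");
}
```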

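One parameter type for borrowed and owned arguments

Cow has another use: a single function parameter can accept either a borrow or a transfer of ownership, which makes a function's signature more flexible.

For example: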

fn foo(s: i32) {
    println!("{}", s);
}

fn bar(s: &i32) {
    println!("{}", s);
}

How can these two functions be merged? Surely not like this:

fn foobar(s: i32, t: &i32) {...}

Two parameters like this also muddle the function body. A Cow parameter covers both cases:

fn foobar(s: Cow<'_, i32>) {
    match s {
        Cow::Borrowed(s) => println!("{}", s),
        Cow::Owned(s) => println!("{}", s),
    }
}
foobar(Cow::Borrowed(&123));
foobar(Cow::Owned(123));

This solves the problem neatly.
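Callers can be spared the explicit Cow::Borrowed / Cow::Owned wrapping by accepting anything that converts into a Cow. The following is one possible sketch, not the only way to do it; describe is a hypothetical function, and it uses str rather than i32 because the standard library provides the From conversions for &str, &String and String.

```rust
use std::borrow::Cow;

/// Hypothetical function: accepts a &str, a &String, a String,
/// or a ready-made Cow, and reports which form it received.
fn describe<'a>(name: impl Into<Cow<'a, str>>) -> String {
    let name: Cow<'a, str> = name.into();
    let kind = match name {
        Cow::Borrowed(_) => "borrowed",
        Cow::Owned(_) => "owned",
    };
    format!("{} ({})", name, kind)
}

fn main() {
    let s = String::from("baz");
    println!("{}", describe("foo"));                // from &str
    println!("{}", describe(&s));                   // from &String
    println!("{}", describe(s));                    // from String, ownership moves in
    println!("{}", describe(Cow::Borrowed("qux"))); // a Cow passes through unchanged
}
```

The trade-off is a generic signature instead of a concrete one; the function body still sees a single Cow<'a, str>.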

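Summary

Whether a Cow starts out as Borrowed or Owned matters little; what matters is whether ownership needs to be handed over.
Wrapping data in Cow gives a single type that is easier to pass around.
To modify data wrapped in a Cow, take it out with into_owned or to_mut: the former takes ownership and then modifies the data, the latter modifies it through a mutable borrow. In both cases a Borrowed value is copied first (that is, copy-on-write).
Copy-on-write applies only to Borrowed; it never happens for Owned.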
