
0403: Serialization and Sort Order

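Primary key comparison order

Keys in the KV store are compared as byte strings, that is, with bytes.Compare(). Data in a relational database is typed, however, so the serialized keys do not necessarily compare in the same order as the typed values they represent.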
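To make the problem concrete, here is a minimal sketch. It assumes a fixed-width little-endian integer encoding; the helper names encodeLE and compareEncoded are made up for this illustration and are not part of the tutorial's code.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// encodeLE is a hypothetical value encoding: fixed-width, little-endian.
func encodeLE(v int64) []byte {
	return binary.LittleEndian.AppendUint64(nil, uint64(v))
}

// compareEncoded compares two integers the way the KV store would:
// by their serialized bytes.
func compareEncoded(a, b int64) int {
	return bytes.Compare(encodeLE(a), encodeLE(b))
}

func main() {
	// 1 < 256 as integers, but 01 00 ... sorts after 00 01 ... as bytes.
	fmt.Println(compareEncoded(1, 256)) // 1
	// -1 < 1 as integers, but ff ff ... sorts after 01 00 ... as bytes.
	fmt.Println(compareEncoded(-1, 1)) // 1
}
```

Both comparisons come out "greater than" although the integers are smaller, which is exactly the mismatch described above.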

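One fix is to deserialize the keys before comparing them and then compare according to their actual types. But that makes the KV layer depend on the schema of the layer above it, which couples the modules tightly. So some databases take a different approach: design a serialization format whose output preserves the comparison order.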

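The original Cell.Encode() function is renamed to Cell.EncodeVal(); it serializes the V of a KV pair.
A new Cell.EncodeKey() function serializes the K of a KV pair.
Cell.Decode() is changed in the same way: it becomes DecodeVal(), alongside a new DecodeKey().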

// new
func (cell *Cell) EncodeKey(toAppend []byte) []byte
func (cell *Cell) DecodeKey(data []byte) (rest []byte, err error)
// existing (renamed)
func (cell *Cell) EncodeVal(toAppend []byte) []byte
func (cell *Cell) DecodeVal(data []byte) (rest []byte, err error)

Reimplement the corresponding Row methods so that they call the new Cell methods:

func (row Row) EncodeKey(schema *Schema) (key []byte)
func (row Row) DecodeKey(schema *Schema, key []byte) (err error)
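One way these Row methods could look is sketched below. Everything about Schema here is an assumption made for the example (the Table, Cols and PKey fields, and the table name followed by a 0x00 byte as key prefix); the Cell methods handle only TypeI64 to keep the sketch short, and compareKeys is a hypothetical helper.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

type CellType uint8

const TypeI64 CellType = 1

type Cell struct {
	Type CellType
	I64  int64
}

// Column and Schema are assumed shapes, not the tutorial's definitions.
type Column struct {
	Name string
	Type CellType
}

type Schema struct {
	Table string
	Cols  []Column
	PKey  []int // indexes into Cols that form the primary key
}

type Row []Cell

func (cell *Cell) EncodeKey(out []byte) []byte {
	if cell.Type != TypeI64 {
		panic("this sketch handles TypeI64 only")
	}
	return binary.BigEndian.AppendUint64(out, uint64(cell.I64)^(1<<63))
}

func (cell *Cell) DecodeKey(data []byte) (rest []byte, err error) {
	if cell.Type != TypeI64 || len(data) < 8 {
		return nil, errors.New("bad key cell")
	}
	cell.I64 = int64(binary.BigEndian.Uint64(data) ^ (1 << 63))
	return data[8:], nil
}

// EncodeKey concatenates the assumed table prefix and the key columns.
func (row Row) EncodeKey(schema *Schema) (key []byte) {
	key = append(key, schema.Table...)
	key = append(key, 0x00)
	for _, idx := range schema.PKey {
		key = row[idx].EncodeKey(key)
	}
	return key
}

// DecodeKey fills the primary key columns of row from key.
func (row Row) DecodeKey(schema *Schema, key []byte) (err error) {
	prefix := append([]byte(schema.Table), 0x00)
	if !bytes.HasPrefix(key, prefix) {
		return errors.New("key belongs to another table")
	}
	key = key[len(prefix):]
	for _, idx := range schema.PKey {
		row[idx].Type = schema.Cols[idx].Type
		if key, err = row[idx].DecodeKey(key); err != nil {
			return err
		}
	}
	if len(key) != 0 {
		return errors.New("trailing bytes in key")
	}
	return nil
}

// compareKeys compares two rows by their encoded primary keys.
func compareKeys(schema *Schema, a, b Row) int {
	return bytes.Compare(a.EncodeKey(schema), b.EncodeKey(schema))
}

func main() {
	schema := &Schema{
		Table: "t",
		Cols:  []Column{{Name: "a", Type: TypeI64}, {Name: "b", Type: TypeI64}},
		PKey:  []int{0, 1},
	}
	r1 := Row{{Type: TypeI64, I64: -1}, {Type: TypeI64, I64: 9}}
	r2 := Row{{Type: TypeI64, I64: 1}, {Type: TypeI64, I64: -9}}
	fmt.Println(compareKeys(schema, r1, r2)) // -1

	out := make(Row, len(schema.Cols))
	err := out.DecodeKey(schema, r1.EncodeKey(schema))
	fmt.Println(out[0].I64, out[1].I64, err) // -1 9 <nil>
}
```

Because each cell encoding preserves order and has a fixed width here, concatenating the cells gives a key that sorts by the first column and then by the second.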

Order-preserving integer serialization

The following encoding of int64 preserves the comparison order:

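// requires: import "encoding/binary"
func (cell *Cell) EncodeKey(out []byte) []byte {
    switch cell.Type {
    case TypeI64:
        // flip the sign bit, then store big-endian
        return binary.BigEndian.AppendUint64(out, uint64(cell.I64)^(1<<63))
    case TypeStr:
        // TODO: order-preserving string encoding
    }
    panic("not implemented") // Go requires a terminating statement here
}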

See the full version of the tutorial for a detailed explanation.
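In short, big-endian puts the most significant byte first, so unsigned integers already compare correctly as bytes, and flipping the sign bit shifts the int64 range so that negative numbers land below non-negative ones. The sketch below checks this on a few values; encodeI64Key, decodeI64Key and orderPreserved are standalone helper names chosen for this example rather than the tutorial's methods.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

// encodeI64Key appends v with the sign bit flipped, in big-endian order.
func encodeI64Key(out []byte, v int64) []byte {
	return binary.BigEndian.AppendUint64(out, uint64(v)^(1<<63))
}

// decodeI64Key reverses encodeI64Key and returns the unread bytes.
func decodeI64Key(data []byte) (v int64, rest []byte, err error) {
	if len(data) < 8 {
		return 0, nil, errors.New("int64 key is truncated")
	}
	v = int64(binary.BigEndian.Uint64(data) ^ (1 << 63))
	return v, data[8:], nil
}

// orderPreserved reports whether the encoded keys of a strictly
// increasing input are also strictly increasing under bytes.Compare.
func orderPreserved(sorted []int64) bool {
	for i := 1; i < len(sorted); i++ {
		prev := encodeI64Key(nil, sorted[i-1])
		curr := encodeI64Key(nil, sorted[i])
		if bytes.Compare(prev, curr) >= 0 {
			return false
		}
	}
	return true
}

func main() {
	values := []int64{math.MinInt64, -2, -1, 0, 1, 2, math.MaxInt64}
	for _, v := range values {
		// e.g. -1 -> 7fffffffffffffff and 0 -> 8000000000000000
		fmt.Printf("%20d -> %x\n", v, encodeI64Key(nil, v))
	}
	fmt.Println(orderPreserved(values)) // true
}
```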

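Order-preserving string serialization

A string serialized as a null-terminated string can be compared directly. Zero bytes inside the string must be escaped, and so must the escape byte 0x01 itself: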

0x00 ⇔ 0x01 0x01
0x01 ⇔ 0x01 0x02

func encodeStrKey(toAppend []byte, input []byte) []byte
func decodeStrKey(data []byte) (out []byte, rest []byte, err error)
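A possible implementation of these two functions, following the escape table above, is sketched here. The error messages and the compareStrKeys helper are assumptions made for the example.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// encodeStrKey escapes 0x00 and 0x01, then appends the 0x00 terminator.
func encodeStrKey(toAppend []byte, input []byte) []byte {
	for _, ch := range input {
		if ch == 0x00 || ch == 0x01 {
			toAppend = append(toAppend, 0x01, ch+1)
		} else {
			toAppend = append(toAppend, ch)
		}
	}
	return append(toAppend, 0x00)
}

// decodeStrKey reads one escaped string up to its terminator and
// returns the unread bytes in rest.
func decodeStrKey(data []byte) (out []byte, rest []byte, err error) {
	for i := 0; i < len(data); i++ {
		switch data[i] {
		case 0x00: // terminator
			return out, data[i+1:], nil
		case 0x01: // escape: the next byte must be 0x01 or 0x02
			i++
			if i >= len(data) || (data[i] != 0x01 && data[i] != 0x02) {
				return nil, nil, errors.New("bad escape in string key")
			}
			out = append(out, data[i]-1)
		default:
			out = append(out, data[i])
		}
	}
	return nil, nil, errors.New("string key is not terminated")
}

// compareStrKeys compares two strings by their encoded keys.
func compareStrKeys(a, b string) int {
	return bytes.Compare(encodeStrKey(nil, []byte(a)), encodeStrKey(nil, []byte(b)))
}

func main() {
	// The inputs are listed in ascending order; so are their encodings.
	for _, s := range []string{"", "a", "a\x00", "a\x01", "ab"} {
		fmt.Printf("%q -> % x\n", s, encodeStrKey(nil, []byte(s)))
	}
	fmt.Println(compareStrKeys("a", "ab")) // -1
}
```

The terminator is what makes a prefix sort first: the encoding of "a" ends in 0x00, while any longer string continues with a byte of at least 0x01.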

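You are reading the free version of this tutorial. From chapter 4 onward it gives only brief pointers, which suits readers who enjoy a challenge and self-study.
A full version with detailed guidance and background knowledge is available for purchase.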
