Sync Map 源码解析

Sync map

Posted by WR on December 3, 2022

基础概念

​ 众所周知,Go 语言的 Map 是非并发安全的,因此 Go 官方提供了一个并发安全的 Map 实现 Sync.map

并发安全的 Map 有哪些实现方式 ?

  1. 一个大的 Map 划分成多个分片,每个分片持有一把锁
  2. 读写分离,读和写分别是一个 Map

​ 第一中方式,orcaman 提供了这个思路的一个实现: concurrent-map,而 sync.map 采用了第二种实现方式

实现原理

  1. 空间换时间。 通过冗余的两个数据结构(read、dirty),实现加锁对性能的影响。
  2. 使用只读数据(read),避免读写冲突。
  3. 动态调整,miss次数多了之后,将dirty数据提升为read。
  4. double-checking。
  5. 延迟删除。 删除一个键值只是打标记,只有在提升dirty的时候才清理删除的数据。
  6. 优先从read读取、更新、删除,因为对read的读取不需要锁。

源码分析

sync map 结构
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
// Map is like a Go map[interface{}]interface{} but is safe for concurrent use
// by multiple goroutines without additional locking or coordination.
// Loads, stores, and deletes run in amortized constant time.
//
// The Map type is specialized. Most code should use a plain Go map instead,
// with separate locking or coordination, for better type safety and to make it
// easier to maintain other invariants along with the map content.
//
// The Map type is optimized for two common use cases: (1) when the entry for a given
// key is only ever written once but read many times, as in caches that only grow,
// or (2) when multiple goroutines read, write, and overwrite entries for disjoint
// sets of keys. In these two cases, use of a Map may significantly reduce lock
// contention compared to a Go map paired with a separate Mutex or RWMutex.
//
// The zero Map is empty and ready for use. A Map must not be copied after first use.
type Map struct {
	mu Mutex

	// read contains the portion of the map's contents that are safe for
	// concurrent access (with or without mu held).
	//
	// The read field itself is always safe to load, but must only be stored with
	// mu held.
	//
	// Entries stored in read may be updated concurrently without mu, but updating
	// a previously-expunged entry requires that the entry be copied to the dirty
	// map and unexpunged with mu held.
  // 一个只读的数据结构,因为只读,所以不会有读写冲突。
  // 所以从这个数据中读取总是安全的。
  // 实际上,实际也会更新这个数据的 entries,如果entry是未删除的(unexpunged), 并不需要加锁。如果entry已经被	// 删除了,需要加锁,以便更新dirty数据。
	read atomic.Value // readOnly

	// dirty contains the portion of the map's contents that require mu to be
	// held. To ensure that the dirty map can be promoted to the read map quickly,
	// it also includes all of the non-expunged entries in the read map.
	//
	// Expunged entries are not stored in the dirty map. An expunged entry in the
	// clean map must be unexpunged and added to the dirty map before a new value
	// can be stored to it.
	//
	// If the dirty map is nil, the next write to the map will initialize it by
	// making a shallow copy of the clean map, omitting stale entries.
  // dirty数据包含当前的map包含的entries,它包含最新的entries(包括read中未删除的数据,虽有冗余,但是提升		    	// dirty字段为read的时候非常快,不用一个一个的复制,而是直接将这个数据结构作为read字段的一部分),有些数据还可   	// 能没有移动到read字段中。
	// 对于dirty的操作需要加锁,因为对它的操作可能会有读写竞争。
	// 当dirty为空的时候, 比如初始化或者刚提升完,下一次的写操作会复制read字段中未删除的数据到这个数据中。
	dirty map[interface{}]*entry

	// misses counts the number of loads since the read map was last updated that
	// needed to lock mu to determine whether the key was present.
	//
	// Once enough misses have occurred to cover the cost of copying the dirty
	// map, the dirty map will be promoted to the read map (in the unamended
	// state) and the next store to the map will make a new dirty copy.
  // 当从Map中读取entry的时候,如果read中不包含这个entry,会尝试从dirty中读取,这个时候会将misses加一,
  // 当misses累积到 dirty的长度的时候, 就会将dirty提升为read,避免从dirty中miss太多次。因为操作dirty需要      	 // 加锁。
	misses int
}

​ Read 的数据结构是 ReadOnly

1
2
3
4
5
// readOnly is an immutable struct stored atomically in the Map.read field.
type readOnly struct {
	m       map[interface{}]*entry
	amended bool // true if the dirty map contains some key not in m.
}

​ amended 表明 dirty 中有 read 中未包含的数据,当 amended 为 true,在 read 中找不到时,还需要去 dirty 中查。

​ 虽然 read 和 dirty 有冗余数据,但这些数据是通过指针指向同一个数据,所以尽管Map的value会很大,但是冗余的 空间占用还是有限的。

readOnly.mMap.dirty存储的值类型是*entry,它包含一个指针p, 指向用户存储的value值。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
// An entry is a slot in the map corresponding to a particular key.
type entry struct {
	// p points to the interface{} value stored for the entry.
	//
	// If p == nil, the entry has been deleted and m.dirty == nil.
	//
	// If p == expunged, the entry has been deleted, m.dirty != nil, and the entry
	// is missing from m.dirty.
	//
	// Otherwise, the entry is valid and recorded in m.read.m[key] and, if m.dirty
	// != nil, in m.dirty[key].
	//
	// An entry can be deleted by atomic replacement with nil: when m.dirty is
	// next created, it will atomically replace nil with expunged and leave
	// m.dirty[key] unset.
	//
	// An entry's associated value can be updated by atomic replacement, provided
	// p != expunged. If p == expunged, an entry's associated value can be updated
	// only after first setting m.dirty[key] = e so that lookups using the dirty
	// map find the entry.
	p unsafe.Pointer // *interface{}
}

func (e *entry) load() (value interface{}, ok bool) {
	p := atomic.LoadPointer(&e.p)
	if p == nil || p == expunged {
		return nil, false
	}
	return *(*interface{})(p), true
}

​ 结合注释可知,p 有三种值,nil、expunged、实际值

​ nil: entry已被删除了,并且m.dirty为nil

​ expunged: entry已被删除了,并且m.dirty不为nil,而且这个entry不存在于m.dirty中

Sync Map Load 方法
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
// Load returns the value stored in the map for a key, or nil if no
// value is present.
// The ok result indicates whether value was found in the map.
func (m *Map) Load(key interface{}) (value interface{}, ok bool) {
	read, _ := m.read.Load().(readOnly)
	e, ok := read.m[key]
	if !ok && read.amended {
		m.mu.Lock()
		// Avoid reporting a spurious miss if m.dirty got promoted while we were
		// blocked on m.mu. (If further loads of the same key will not miss, it's
		// not worth copying the dirty map for this key.)
		read, _ = m.read.Load().(readOnly)
		e, ok = read.m[key]
		if !ok && read.amended {
			e, ok = m.dirty[key]
			// Regardless of whether the entry was present, record a miss: this key
			// will take the slow path until the dirty map is promoted to the read
			// map.
			m.missLocked()
		}
		m.mu.Unlock()
	}
	if !ok {
		return nil, false
	}
	return e.load()
}
  • 首先读取 read 里面的值,如果key对应的值存在,则直接返回,整个过程是无锁操作。如果key对应的值不存在并且amended==true(说明dirty里面有read不包含的值)则从dirty 里面查找
  • 查找之前先进行加锁,read 是并发安全的,dirty 是非并发安全的所以操作之前需要加锁
  • 加完锁之后在读取一遍 read 看值是否存在, !ok && read.amended, m.mu.Lock() 因为这两句是非原子的,所以加完锁之前 dirty 有可能提升为 read
  • 然后去 dirty 里面查找值,并且不管 dirty 里面有没有都要进行一次 missLocked 操作
  • 释放锁
missLocked
1
2
3
4
5
6
7
8
9
func (m *Map) missLocked() {
	m.misses++
	if m.misses < len(m.dirty) {
		return
	}
	m.read.Store(readOnly{m: m.dirty})
	m.dirty = nil
	m.misses = 0
}
  • 给 misses加1
  • 如果 misses 小于 dirty的长度,则返回,不进行 dirty 提升
  • 否则,把 dirty 提升为 read
  • 把 dirty 重置

​ 总而言之就是当miss次数过多时,会把 dirty 提升为 read

load 过程概述:

​ 先读取 read,如果 read 里面有则直接返回,整个过程是无锁操作,如果 read 里面没有,并且 dirty 里面有 read 里面不存在的值,则进入到 dirty 里面查找,dirty 不同于 read 需要加锁,读取dirty的过程还要进行一次 missLocked 操作,主要用于miss加值及判断是否需要把 dirty 提升为 read。

Sync Map Store
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
// Store sets the value for a key.
func (m *Map) Store(key, value interface{}) {
	read, _ := m.read.Load().(readOnly)
	if e, ok := read.m[key]; ok && e.tryStore(&value) {
		return
	}

	m.mu.Lock()
	read, _ = m.read.Load().(readOnly)
	if e, ok := read.m[key]; ok {
		if e.unexpungeLocked() {
			// The entry was previously expunged, which implies that there is a
			// non-nil dirty map and this entry is not in it.
			m.dirty[key] = e
		}
		e.storeLocked(&value)
	} else if e, ok := m.dirty[key]; ok {
		e.storeLocked(&value)
	} else {
		if !read.amended {
			// We're adding the first new key to the dirty map.
			// Make sure it is allocated and mark the read-only map as incomplete.
			m.dirtyLocked()
			m.read.Store(readOnly{m: read.m, amended: true})
		}
		m.dirty[key] = newEntry(value)
	}
	m.mu.Unlock()
}
  • 首先会读取 read 的值,如果 read 包含对应 key,则使用 tryStore 方法进行更新,如果更新成功则返回

    • tryStore 方法,判断key是否标记为删除,如果标记为删除则返回false,否则不断进行乐观更新

      1
      2
      3
      4
      5
      6
      7
      8
      9
      10
      11
      12
      13
      14
      15
      
      // tryStore stores a value if the entry has not been expunged.
      //
      // If the entry is expunged, tryStore returns false and leaves the entry
      // unchanged.
      func (e *entry) tryStore(i *interface{}) bool {
         for {
            p := atomic.LoadPointer(&e.p)
            if p == expunged {
               return false
            }
            if atomic.CompareAndSwapPointer(&e.p, p, unsafe.Pointer(i)) {
               return true
            }
         }
      }
      
  • 更新失败则进行加锁
  • 并再次读取 read 查看值是否存在,因为加锁之前 dirty 有可能被提升为 read
  • read 有值,如果之前是标记为删除,则更新值为未删除,并且更新值到 dirty 里面,然后更新 entry 值
  • 否则如果 dirty 存在对应的值,则更新 dirty 的值
  • 否则如果 read 和 dirty 里面都不存在对应值
    • 如果 amended 为 false
      • 进行 dirtyLocked,这个方法是把 read 里面未被标记删除的值拷贝到 dirty 里面(如果数据量过大可能影响性能)
      • 把 amended 标记为 true,这样下次 load 的时候发现值未命中则进入 dirty 查找,并且有可能把 dirty 提升为 read
    • 把值更新到 dirty 里面
  • 释放锁
dirtyLocked
1
2
3
4
5
6
7
8
9
10
11
12
13
func (m *Map) dirtyLocked() {
   if m.dirty != nil {
      return
   }

   read, _ := m.read.Load().(readOnly)
   m.dirty = make(map[interface{}]*entry, len(read.m))
   for k, e := range read.m {
      if !e.tryExpungeLocked() {
         m.dirty[k] = e
      }
   }
}

​ 把 read 里面未被标记为删除的值拷贝到 dirty

Sync Map Delete
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
// Delete deletes the value for a key.
func (m *Map) Delete(key interface{}) {
	m.LoadAndDelete(key)
}


// LoadAndDelete deletes the value for a key, returning the previous value if any.
// The loaded result reports whether the key was present.
func (m *Map) LoadAndDelete(key interface{}) (value interface{}, loaded bool) {
	read, _ := m.read.Load().(readOnly)
	e, ok := read.m[key]
	if !ok && read.amended {
		m.mu.Lock()
		read, _ = m.read.Load().(readOnly)
		e, ok = read.m[key]
		if !ok && read.amended {
			e, ok = m.dirty[key]
			delete(m.dirty, key)
			// Regardless of whether the entry was present, record a miss: this key
			// will take the slow path until the dirty map is promoted to the read
			// map.
			m.missLocked()
		}
		m.mu.Unlock()
	}
	if ok {
		return e.delete()
	}
	return nil, false
}
  • 首先查看 read 中是否存在对应的值,如果存在则调用 e.delete() 进行删除,这个方法只是把对应的值标记为删除
  • 如果不存在并且 dirty 里面有 read 不存在的值,则查看 dirty
  • 这里同样使用了双检测策略,前面已经讲过,这里不再赘述
  • 如果dirty 里面存在对应的值,则进行删除
  • 否则返回 false

适用场景

​ 由以上分析可知,Sync map 适用于 读多写少的场景,因为写多读多的场景由于,频繁的 miss 导致不得不每次查找 dirty,由于查找 dirty 需要加锁,从而导致性能下降。而且在 store 的过程中由于会有把 read 复制到 dirty 的操作,大量复制也会导致性能下降,尤其是数据量较大的场景。所以尽管是读多写少的场景,如果数据量过大,也可能会有性能抖动。

引用

sync.map 揭秘 (https://colobu.com/2017/07/11/dive-into-sync-Map/)