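1. split
When a table is created, HBase allocates a single region by default, so every read and write request hits that one region on one RegionServer. No load balancing is possible in this state, and the other RegionServers in the cluster may sit mostly idle. Pre-splitting solves this: configure the split points when creating the table so that it starts out with multiple regions.
If no split points are configured when the table is initialized, HBase has no way to know how to split the region, because it does not know which row keys would make good split points. If we can roughly predict the row key distribution, we can pre-split the regions ahead of time. If the prediction is inaccurate, a region can still become a hotspot that receives most of the traffic, but auto-split remains as a fallback. A good approach is to estimate the split points first for the pre-split, and then let auto-split handle load balancing afterwards.
HBase ships with two split algorithms, HexStringSplit and UniformSplit. If the row keys are prefixed with a hexadecimal string, HexStringSplit is a good fit as the pre-split algorithm. For example, we can use HexHash(prefix) as the row key prefix, where HexHash is a hash function whose final output is a hexadecimal string. We can also plug in our own split algorithm.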
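As a rough illustration of how evenly spaced hexadecimal split points can be derived, the sketch below divides an assumed 8-digit hex key space (00000000 to ffffffff) into equal ranges and builds a hex prefix from a hash. It is not HBase's HexStringSplit implementation; the names `hex_split_points` and `hex_hash`, the 8-digit width, and the choice of MD5 are all assumptions made for this example.

```python
import hashlib


def hex_split_points(num_regions, width=8):
    """Return num_regions - 1 evenly spaced hex split keys (illustrative only)."""
    if num_regions < 2:
        return []
    space = 16 ** width  # size of the assumed key space
    return [
        format(space * i // num_regions, "0{}x".format(width))
        for i in range(1, num_regions)
    ]


def hex_hash(prefix, width=8):
    """Hypothetical HexHash: first `width` hex digits of the MD5 of the prefix."""
    return hashlib.md5(prefix.encode("utf-8")).hexdigest()[:width]


if __name__ == "__main__":
    print(hex_split_points(4))  # ['40000000', '80000000', 'c0000000']
    print(hex_hash("user42") + "_user42")  # row key with a hashed hex prefix
```

Because the hash output is spread roughly uniformly over the hex key space, row keys built this way should land on the pre-split regions fairly evenly, which is the reason a hex-prefixed key pairs well with evenly spaced hex split points.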
2. Create a pre-split table
- hbase(main):018:> create 'mytab','cf1',{SPLITS=>['10','20','30','40','50','60','70','80','90','100']}
- 0 row(s) in 1.1810 seconds
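HBase orders row keys byte by byte, not numerically, so the split key '100' sorts between '10' and '20'. The sketch below reproduces the region boundaries that the 10 split points above should yield; the helper name `region_ranges` is made up for this example, and the code is plain Python rather than an HBase API call.

```python
def region_ranges(split_keys):
    """Return (start_key, end_key) pairs for the regions implied by split_keys.

    Keys are compared as bytes, which mirrors how HBase orders row keys.
    An empty key stands for "open ended" at the first and last region.
    """
    keys = sorted(k.encode("utf-8") for k in split_keys)
    bounds = [b""] + keys + [b""]
    return list(zip(bounds[:-1], bounds[1:]))


if __name__ == "__main__":
    splits = ['10', '20', '30', '40', '50', '60', '70', '80', '90', '100']
    for start, end in region_ranges(splits):
        print(repr(start.decode()), "->", repr(end.decode()))
```

Ten split points give eleven regions, and the second region covers '10' to '100' rather than '10' to '20'. If numeric ordering is intended, zero-padding the keys to a fixed width (for example '010', '020', ..., '100') is one way to make byte order and numeric order agree.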
3. View the metadata
- hbase(main):022:* scan '.META.'
- ROW COLUMN+CELL
- hivetest,,1441805677008.42b9d2fff35c183 column=info:regioninfo, timestamp=1441805677121, value={NAME => 'hivetest,,1441805677008.42b9d2fff35c18393058668f77e
- 93058668f77e7b86e. 7b86e.', STARTKEY => '', ENDKEY => '', ENCODED => 42b9d2fff35c18393058668f77e7b86e,}
- hivetest,,1441805677008.42b9d2fff35c183 column=info:server, timestamp=1441810207113, value=slave3:60020
- 93058668f77e7b86e.
- hivetest,,1441805677008.42b9d2fff35c183 column=info:serverstartcode, timestamp=1441810207113, value=1441810138459
- 93058668f77e7b86e.
- mytab,,1441819038526.7b1b646f3237bf7237 column=info:server, timestamp=1441819040882, value=slave1:60020
- 381a454729ee5d.
- mytab,,1441819038526.7b1b646f3237bf7237 column=info:serverstartcode, timestamp=1441819040882, value=1441810262755
- 381a454729ee5d.
- mytab,,1441819566171.b640abdb82a91acb98 column=info:regioninfo, timestamp=1441819540261, value={NAME => 'mytab,,1441819566171.b640abdb82a91acb98844d109ad03b
- 844d109ad03ba7. a7.', STARTKEY => '', ENDKEY => '10', ENCODED => b640abdb82a91acb98844d109ad03ba7,}
- mytab,,1441819566171.b640abdb82a91acb98 column=info:server, timestamp=1441819568635, value=slave1:60020
- 844d109ad03ba7.
- mytab,,1441819566171.b640abdb82a91acb98 column=info:serverstartcode, timestamp=1441819568635, value=1441810262755
- 844d109ad03ba7.
- mytab,10,1441819566171.9c6d2a8e69c5aea3 column=info:regioninfo, timestamp=1441819540261, value={NAME => 'mytab,10,1441819566171.9c6d2a8e69c5aea308b79debf82c
- 08b79debf82cc910. c910.', STARTKEY => '10', ENDKEY => '100', ENCODED => 9c6d2a8e69c5aea308b79debf82cc910,}
- mytab,10,1441819566171.9c6d2a8e69c5aea3 column=info:server, timestamp=1441819560693, value=slave2:60020
- 08b79debf82cc910.
- mytab,10,1441819566171.9c6d2a8e69c5aea3 column=info:serverstartcode, timestamp=1441819560693, value=1441810254954
- 08b79debf82cc910.
- mytab,100,1441819566172.accd15db4e78d03 column=info:regioninfo, timestamp=1441819540261, value={NAME => 'mytab,100,1441819566172.accd15db4e78d03c8a53cd6b1e1
- c8a53cd6b1e17f185. 7f185.', STARTKEY => '100', ENDKEY => '20', ENCODED => accd15db4e78d03c8a53cd6b1e17f185,}
- mytab,100,1441819566172.accd15db4e78d03 column=info:server, timestamp=1441819540642, value=slave3:60020
- c8a53cd6b1e17f185.
- mytab,100,1441819566172.accd15db4e78d03 column=info:serverstartcode, timestamp=1441819540642, value=1441810138459
- c8a53cd6b1e17f185.
- mytab,20,1441819566172.f81699b3cefb992f column=info:regioninfo, timestamp=1441819540261, value={NAME => 'mytab,20,1441819566172.f81699b3cefb992fa5c8e5a0fd1f
- a5c8e5a0fd1f133b. 133b.', STARTKEY => '20', ENDKEY => '30', ENCODED => f81699b3cefb992fa5c8e5a0fd1f133b,}
- mytab,20,1441819566172.f81699b3cefb992f column=info:server, timestamp=1441819560897, value=slave2:60020
- a5c8e5a0fd1f133b.
- mytab,20,1441819566172.f81699b3cefb992f column=info:serverstartcode, timestamp=1441819560897, value=1441810254954
- a5c8e5a0fd1f133b.
- mytab,30,1441819566172.97e21ce593d16c77 column=info:regioninfo, timestamp=1441819540261, value={NAME => 'mytab,30,1441819566172.97e21ce593d16c77d35c253c447c
- d35c253c447c176d. 176d.', STARTKEY => '30', ENDKEY => '40', ENCODED => 97e21ce593d16c77d35c253c447c176d,}
- mytab,30,1441819566172.97e21ce593d16c77 column=info:server, timestamp=1441819568737, value=slave1:60020
- d35c253c447c176d.
- mytab,30,1441819566172.97e21ce593d16c77 column=info:serverstartcode, timestamp=1441819568737, value=1441810262755
- d35c253c447c176d.
- mytab,40,1441819566172.c17b2121db16f1f8 column=info:regioninfo, timestamp=1441819540261, value={NAME => 'mytab,40,1441819566172.c17b2121db16f1f8938077016bfc
- 938077016bfc524d. 524d.', STARTKEY => '40', ENDKEY => '50', ENCODED => c17b2121db16f1f8938077016bfc524d,}
- mytab,40,1441819566172.c17b2121db16f1f8 column=info:server, timestamp=1441819560776, value=slave2:60020
- 938077016bfc524d.
- mytab,40,1441819566172.c17b2121db16f1f8 column=info:serverstartcode, timestamp=1441819560776, value=1441810254954
- 938077016bfc524d.
- mytab,50,1441819566172.4c6f25b2a4469ba6 column=info:regioninfo, timestamp=1441819540261, value={NAME => 'mytab,50,1441819566172.4c6f25b2a4469ba63be2a28e9f79
- 3be2a28e9f79681d. 681d.', STARTKEY => '50', ENDKEY => '60', ENCODED => 4c6f25b2a4469ba63be2a28e9f79681d,}
- mytab,50,1441819566172.4c6f25b2a4469ba6 column=info:server, timestamp=1441819568670, value=slave1:60020
- 3be2a28e9f79681d.
- mytab,50,1441819566172.4c6f25b2a4469ba6 column=info:serverstartcode, timestamp=1441819568670, value=1441810262755
- 3be2a28e9f79681d.
- mytab,60,1441819566172.cf803d46bd2fd225 column=info:regioninfo, timestamp=1441819540261, value={NAME => 'mytab,60,1441819566172.cf803d46bd2fd225cce52d2aad18
- cce52d2aad18314a. 314a.', STARTKEY => '60', ENDKEY => '70', ENCODED => cf803d46bd2fd225cce52d2aad18314a,}
- mytab,60,1441819566172.cf803d46bd2fd225 column=info:server, timestamp=1441819540671, value=slave3:60020
- cce52d2aad18314a.
- mytab,60,1441819566172.cf803d46bd2fd225 column=info:serverstartcode, timestamp=1441819540671, value=1441810138459
- cce52d2aad18314a.
- mytab,70,1441819566172.58be7af75c6d124a column=info:regioninfo, timestamp=1441819540261, value={NAME => 'mytab,70,1441819566172.58be7af75c6d124aad1a8385f741
- ad1a8385f7410224. 0224.', STARTKEY => '70', ENDKEY => '80', ENCODED => 58be7af75c6d124aad1a8385f7410224,}
- mytab,70,1441819566172.58be7af75c6d124a column=info:server, timestamp=1441819540765, value=slave3:60020
- ad1a8385f7410224.
- mytab,70,1441819566172.58be7af75c6d124a column=info:serverstartcode, timestamp=1441819540765, value=1441810138459
- ad1a8385f7410224.
- mytab,80,1441819566173.fc2ed8a43e2eaf02 column=info:regioninfo, timestamp=1441819540261, value={NAME => 'mytab,80,1441819566173.fc2ed8a43e2eaf02238d7e1bc398
- 238d7e1bc398cc30. cc30.', STARTKEY => '80', ENDKEY => '90', ENCODED => fc2ed8a43e2eaf02238d7e1bc398cc30,}
- mytab,80,1441819566173.fc2ed8a43e2eaf02 column=info:server, timestamp=1441819560794, value=slave2:60020
- 238d7e1bc398cc30.
- mytab,80,1441819566173.fc2ed8a43e2eaf02 column=info:serverstartcode, timestamp=1441819560794, value=1441810254954
- 238d7e1bc398cc30.
- mytab,90,1441819566173.76ce4f3cde98abd2 column=info:regioninfo, timestamp=1441819540261, value={NAME => 'mytab,90,1441819566173.76ce4f3cde98abd2e7679b099489
- e7679b099489d582. d582.', STARTKEY => '90', ENDKEY => '', ENCODED => 76ce4f3cde98abd2e7679b099489d582,}
- mytab,90,1441819566173.76ce4f3cde98abd2 column=info:server, timestamp=1441819568801, value=slave1:60020
- e7679b099489d582.
- mytab,90,1441819566173.76ce4f3cde98abd2 column=info:serverstartcode, timestamp=1441819568801, value=1441810262755
- e7679b099489d582.
- test,,1441718128275.edca99f7a3a627594f2 column=info:regioninfo, timestamp=1441718128579, value={NAME => 'test,,1441718128275.edca99f7a3a627594f2ab1af7e7cb1a
- ab1af7e7cb1ad. d.', STARTKEY => '', ENDKEY => '', ENCODED => edca99f7a3a627594f2ab1af7e7cb1ad,}
- test,,1441718128275.edca99f7a3a627594f2 column=info:server, timestamp=1441810207198, value=slave3:60020
- ab1af7e7cb1ad.
- test,,1441718128275.edca99f7a3a627594f2 column=info:serverstartcode, timestamp=1441810207198, value=1441810138459
- ab1af7e7cb1ad.
- 14 row(s) in 0.3110 seconds
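4. Check the table overview in the web UI
The UI shows that the table mytab has 11 online regions, which is what 10 split points produce.
It also lists the start key and end key of each region.
Insert key/value pairs and they are distributed across the regions according to their row keys, which spreads the load.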
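To see how an inserted row key maps to one of the pre-split regions, the following sketch looks up the region index by binary search over the sorted split keys. It is an illustration in plain Python, not how an HBase client actually locates regions, and the function name `region_index` is an assumption.

```python
from bisect import bisect_right


def region_index(row_key, split_keys):
    """Return the 0-based index of the region whose [start, end) range holds row_key.

    Region 0 starts at the empty key; region i (i >= 1) starts at the
    i-th smallest split key. Comparison is byte-wise, as in HBase.
    """
    keys = sorted(k.encode("utf-8") for k in split_keys)
    return bisect_right(keys, row_key.encode("utf-8"))


if __name__ == "__main__":
    splits = ['10', '20', '30', '40', '50', '60', '70', '80', '90', '100']
    for key in ['05', '10', '15', '55', '95']:
        print(key, "-> region", region_index(key, splits))
```

With these split points the key '15' lands in the region that starts at '100', not the one that starts at '10', because '15' is greater than '100' in byte order. Checking a sample of real row keys this way before creating the table can reveal whether the chosen split points spread the data as intended.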
Source: "ITPUB博客", link: http://blog.itpub.net/12219480/viewspace-1797589/. If you repost this article, please credit the source; otherwise legal liability will be pursued.