terry-mao / bfs
Distributed file system (small file storage) written in Golang.
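In the proxy, I see that the upload logic gets the stores from directory, then iterates over them and uploads the file to the store that holds the corresponding volume. If some of the uploads fail, it returns immediately. Won't that leave residual files behind, so that the copies of a volume end up with different sizes and some volumes fill up while others do not?
I looked at the compact logic in volume. It appears to simply skip deleted files and not write them to the new volume; it does nothing to reconcile file differences between the copies of a volume.
My investigation may not be thorough, so let's assume this is the flow. When the network is slightly unstable, write failures happen easily. Doesn't that mean residual files will appear fairly often? How can I handle this problem?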
params = url.Values{}
for _, host = range res.Stores {
	params.Set("key", strconv.FormatInt(res.Key, 10))
	params.Set("cookie", strconv.FormatInt(int64(res.Cookie), 10))
	params.Set("vid", strconv.FormatInt(int64(res.Vid), 10))
	uri = fmt.Sprintf(_storeUploadApi, host)
	if err = Http("POST", uri, params, buf, &sRet); err != nil {
		return
	}
	if sRet.Ret != 1 {
		log.Errorf("http.Post store sRet.Ret: %d %s %d %d %d", sRet.Ret, uri, res.Key, res.Cookie, res.Vid)
		err = errors.ErrInternal
		return
	}
}
if res.Ret == errors.RetNeedleExist {
	err = errors.ErrNeedleExist
}
log.Infof("bfs.upload bucket:%s filename:%s key:%d cookie:%d vid:%d", bucket, filename, res.Key, res.Cookie, res.Vid)
return
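One way to avoid the residue described above is to undo the copies that were already written when a later store fails. The following is only a sketch of that idea, not the project's behavior: the `replica` interface, `uploadAll`, and the in-memory `memReplica` are hypothetical names introduced for illustration, and a real store would be reached over HTTP as in the loop above.

```go
package main

import (
	"errors"
	"fmt"
)

// replica is a hypothetical stand-in for one store holding a copy of the volume.
type replica interface {
	Upload(key int64, data []byte) error
	Delete(key int64) error
}

// uploadAll writes data to every replica in order. If any write fails, it
// deletes the copies already written so the replicas do not drift apart.
// Rollback failures are appended to the returned error, not swallowed.
func uploadAll(replicas []replica, key int64, data []byte) error {
	for i, r := range replicas {
		err := r.Upload(key, data)
		if err == nil {
			continue
		}
		err = fmt.Errorf("upload to replica %d: %w", i, err)
		// Undo the copies written so far, newest first.
		for j := i - 1; j >= 0; j-- {
			if derr := replicas[j].Delete(key); derr != nil {
				err = fmt.Errorf("%w (rollback of replica %d failed: %v)", err, j, derr)
			}
		}
		return err
	}
	return nil
}

// memReplica is an in-memory replica used only to exercise uploadAll.
type memReplica struct {
	files map[int64][]byte
	fail  bool // when true, Upload always fails
}

func newMemReplica(fail bool) *memReplica {
	return &memReplica{files: make(map[int64][]byte), fail: fail}
}

func (m *memReplica) Upload(key int64, data []byte) error {
	if m.fail {
		return errors.New("simulated network failure")
	}
	m.files[key] = append([]byte(nil), data...)
	return nil
}

func (m *memReplica) Delete(key int64) error {
	delete(m.files, key)
	return nil
}
```

With replicas `a` (healthy), `b` (failing), and `c` (healthy), `uploadAll` returns an error and leaves `a` empty again. If the store only marks needles as deleted, as the compact description above suggests, the space of a rolled-back copy would presumably be reclaimed only at the next compaction.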
2017/07/04 18:10:42 Connected to 172.20.2.92:2181
2017/07/04 18:10:42 Authenticated: id=242280265568797060, timeout=6000
2017/07/04 18:10:42 Re-submitting `0` credentials after reconnect
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x768225]
goroutine 1 [running]:
main.NewServer(0xc420020cc0, 0xc42001e120, 0x0, 0x0, 0x0)
	/home/xxx/go/src/bfs/store/http.go:36 +0x45
main.main()
	/home/xxx/go/src/bfs/store/main.go:35 +0x2f3
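As shown above, using the default configuration from https://github.com/Terry-Mao/bfs/blob/master/doc/store.md#config produces the error above. Trying the configuration below still gives the same error:
[limit]
[read]
rate = 150.0
brust = 200
[write]
rate = 150.0
brust = 200
[delete]
rate = 150.0
brust = 200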
"bfs/libs/errors" should be "github.com/Terry-Mao/bfs/libs/errors". Import paths like this keep the project from compiling.
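In directory, the in-memory STORE, GROUP, and VOLUME information is refreshed periodically in the syncZookeeper goroutine and read by the goroutines that serve API requests. Don't these shared in-memory structures need a lock?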
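For readers unfamiliar with the pattern being asked about, here is a minimal sketch of one common way to share such data safely: the sync goroutine builds a complete new snapshot and publishes it under a lock, and readers only ever see a whole snapshot. The `topology` and `registry` types are hypothetical and are not taken from the bfs source; whether bfs needs or already uses something equivalent is not established here.

```go
package main

import "sync"

// topology is a hypothetical snapshot of the maps built from ZooKeeper.
// A snapshot must not be modified after it has been published.
type topology struct {
	volumeStores map[int32][]string // volume id -> store ids
}

// registry guards the current snapshot with a read-write mutex.
type registry struct {
	mu  sync.RWMutex
	cur *topology
}

// swap installs a freshly built snapshot; the sync goroutine would call it.
func (r *registry) swap(t *topology) {
	r.mu.Lock()
	r.cur = t
	r.mu.Unlock()
}

// stores returns the store ids for a volume; API goroutines would call it.
// It returns nil when no snapshot is installed or the volume is unknown.
func (r *registry) stores(vid int32) []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	if r.cur == nil {
		return nil
	}
	return r.cur.volumeStores[vid]
}
```

Because each update replaces the whole snapshot instead of mutating maps in place, readers hold the lock only long enough to read one pointer and one map entry.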
I'd like to understand how BFS initializes machines in actual operation. Thanks!
Is this the same as Bilibili's bfs?
Add handling for empty files (errors.RetNeedleIsEmpty)
if size > int64(h.c.NeedleMaxSize) {
	res["ret"] = errors.RetNeedleTooLarge
	return
}
if size == 0 {
	res["ret"] = errors.RetNeedleIsEmpty
	return
}
https://github.com/Terry-Mao/bfs/blob/master/proxy/bfs/bfs.go#L99
What is the reasoning behind iterating over the stores to fetch the file here?
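One common reason for a loop like this is failover between copies: if one store is unreachable, the next one is tried. That is a guess at the intent and is not confirmed by the source. The sketch below illustrates the pattern only; `fetchFunc`, `readAny`, and `mapFetch` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// fetchFunc is a hypothetical callback that reads one file from one store host.
type fetchFunc func(host string) ([]byte, error)

// readAny tries each host in order and returns the first successful read.
// If every host fails, it returns the error from the last attempt.
func readAny(hosts []string, fetch fetchFunc) ([]byte, error) {
	if len(hosts) == 0 {
		return nil, errors.New("no stores available")
	}
	var lastErr error
	for _, h := range hosts {
		data, err := fetch(h)
		if err == nil {
			return data, nil
		}
		lastErr = fmt.Errorf("store %s: %w", h, err)
	}
	return nil, lastErr
}

// mapFetch returns a fetchFunc backed by an in-memory map. Hosts missing
// from the map behave like unreachable stores. For illustration only.
func mapFetch(m map[string][]byte) fetchFunc {
	return func(host string) ([]byte, error) {
		data, ok := m[host]
		if !ok {
			return nil, errors.New("unreachable")
		}
		return data, nil
	}
}
```

For example, with hosts `a`, `b`, `c` where only `b` answers, `readAny` returns the data served by `b` after one failed attempt on `a`.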
I got an error when building bfs/store:
./fadvise_linux.go:26: cannot use C.__off_t(off) (type C.__off_t) as type C.off_t in argument to _Cfunc_posix_fadvise
./fadvise_linux.go:26: cannot use C.__off_t(size) (type C.__off_t) as type C.off_t in argument to _Cfunc_posix_fadvise
// Changing C.__off_t to C.off_t makes it compile.
if errno = int(C.posix_fadvise(C.int(fd), C.__off_t(off), C.__off_t(size), C.int(advise))); errno != 0 {
	err = syscall.Errno(errno)
}
"distributed file system(small file storage) writen in golang by Terry."
Sorry, I'm not here to nitpick; I'm just casually filing an issue 😺
Which one do you want to choose?
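All volume, machine, and group information is currently stored in ZooKeeper. If there is enough of it, will ZooKeeper become a bottleneck? I don't understand ZooKeeper deeply enough to evaluate this fully, so I'd like to ask about your thinking.
Many documentation links, including the API docs, are broken.
The pitchfork health check status is abnormal.
Visiting http://127.0.0.1:9001 gives the following error: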
Traceback (most recent call last):
File "/Library/Python/2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1836, in __call__
return self.wsgi_app(environ, start_response)
File "/Library/Python/2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1820, in wsgi_app
response = self.make_response(self.handle_exception(e))
File "/Library/Python/2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1403, in handle_exception
reraise(exc_type, exc_value, tb)
File "/Library/Python/2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1817, in wsgi_app
response = self.full_dispatch_request()
File "/Library/Python/2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1477, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/Library/Python/2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1381, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/Library/Python/2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1475, in full_dispatch_request
rv = self.dispatch_request()
File "/Library/Python/2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1461, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/Users/xxx/GoProjects/src/bfs/ops/bfsOps/views/ops.py", line 17, in home
return render_template('index.html')
File "/Library/Python/2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/templating.py", line 127, in render_template
return _render(ctx.app.jinja_env.get_or_select_template(template_name_or_list),
File "/Library/Python/2.7/site-packages/Jinja2-2.8-py2.7.egg/jinja2/environment.py", line 851, in get_or_select_template
return self.get_template(template_name_or_list, parent, globals)
File "/Library/Python/2.7/site-packages/Jinja2-2.8-py2.7.egg/jinja2/environment.py", line 812, in get_template
return self._load_template(name, self.make_globals(globals))
File "/Library/Python/2.7/site-packages/Jinja2-2.8-py2.7.egg/jinja2/environment.py", line 774, in _load_template
cache_key = self.loader.get_source(self, name)[1]
File "/Library/Python/2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/templating.py", line 64, in get_source
raise TemplateNotFound(template)
TemplateNotFound: index.html
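While testing bfs over the past two days, I found that the directory module fails when connecting to HBase, with the following error:
read tcp 192.168.1.23:60518->192.168.1.193:9090: i/o timeout
To locate the problem, I added two test cases to hbase_test.go. Connections obtained from the bfs pool hit this problem, while using the thrift API directly does not.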
import (
	"bfs/directory/conf"
	hbase "bfs/directory/hbase/hbasethrift"
	"bfs/libs/meta"
	"fmt"
	"git.apache.org/thrift.git/lib/go/thrift"
	"net"
	"os"
	"testing"
	"time"
)

// TestOriginal calls Exists through a thrift client created directly.
// HOST and PORT are expected to be defined elsewhere in the test package.
func TestOriginal(t *testing.T) {
	logformatstr_ := "----%s\n"
	logformatstr := "----%s 用时:%d-%d=%d毫秒\n\n" // "elapsed: end-start=N ms"
	logformattitle := "建立连接"                       // "establish connection"
	rowkey := "vim.gif"
	temptable := "bucket_test"
	protocolFactory := thrift.NewTBinaryProtocolFactoryDefault()
	transport, err := thrift.NewTSocketTimeout(net.JoinHostPort(HOST, PORT), 10*time.Second)
	if err != nil {
		fmt.Fprintln(os.Stderr, "error resolving address:", err)
		os.Exit(1)
	}
	client := hbase.NewTHBaseServiceClientFactory(transport, protocolFactory)
	if err := transport.Open(); err != nil {
		fmt.Fprintln(os.Stderr, "Error opening socket to "+HOST+":"+PORT, " ", err)
		os.Exit(1)
	}
	//client, err := hbasePool.Get()
	//if err != nil {
	//	fmt.Println(err)
	//}
	//--------------Exists
	logformattitle = "调用Exists方法" // "call the Exists method"
	fmt.Printf(logformatstr_, logformattitle)
	tmpstartTime := currentTimeMillis()
	//
	isexists, err := (client.Exists([]byte(temptable), &hbase.TGet{Row: []byte(rowkey)}))
	fmt.Printf("rowkey{%s} in table{%s} Exists:%t\t", rowkey, temptable, isexists)
	if err != nil {
		fmt.Printf("Exists err:%s\n", err)
	}
	fmt.Println("")
	tmpendTime := currentTimeMillis()
	fmt.Printf(logformatstr, logformattitle, tmpendTime, tmpstartTime, (tmpendTime - tmpstartTime))
}

// TestPoll calls Exists through a client taken from the bfs connection pool.
func TestPoll(t *testing.T) {
	logformatstr_ := "----%s\n"
	logformatstr := "----%s 用时:%d-%d=%d毫秒\n\n" // "elapsed: end-start=N ms"
	logformattitle := "建立连接"                       // "establish connection"
	rowkey := "vim.gif"
	temptable := "bucket_test"
	client, err := hbasePool.Get()
	if err != nil {
		fmt.Println(err)
	}
	//--------------Exists
	logformattitle = "调用Exists方法" // "call the Exists method"
	fmt.Printf(logformatstr_, logformattitle)
	tmpstartTime := currentTimeMillis()
	//
	isexists, err := (client.Exists([]byte(temptable), &hbase.TGet{Row: []byte(rowkey)}))
	fmt.Printf("rowkey{%s} in table{%s} Exists:%t\t", rowkey, temptable, isexists)
	if err != nil {
		fmt.Printf("Exists err:%s\n", err)
	}
	fmt.Println("")
	tmpendTime := currentTimeMillis()
	fmt.Printf(logformatstr, logformattitle, tmpendTime, tmpstartTime, (tmpendTime - tmpstartTime))
}

// currentTimeMillis returns the current Unix time in milliseconds.
func currentTimeMillis() int64 {
	return time.Now().UnixNano() / 1000000
}
Output:
=== RUN TestOriginal
----调用Exists方法
rowkey{vim.gif} in table{bucket_test} Exists:false
----调用Exists方法 用时:1461205438652-1461205438636=16毫秒
--- PASS: TestOriginal (0.04s)
=== RUN TestPoll
----调用Exists方法
rowkey{vim.gif} in table{bucket_test} Exists:false Exists err:read tcp 192.168.1.23:60518->192.168.1.193:9090: i/o timeout
----调用Exists方法 用时:1461205439683-1461205438682=1001毫秒
--- PASS: TestPoll (1.03s)
The store is in read-only state because of an offset error on volume 129, and I don't know how to recover from this. The log and the relevant code are below.
I0112 12:13:54.480229 95637 pitchfork.go:202] block: /data/bfshome/store/data/129_0, offset: 4294966850
for _, volume = range volumes {
	if volume.Block.LastErr != nil {
		log.Infof("get store block.lastErr:%s host:%s", volume.Block.LastErr, store.Stat)
		store.Status = meta.StoreStatusFail
		break
	} else if volume.Block.Full() {
		log.Infof("block: %s, offset: %d", volume.Block.File, volume.Block.Offset)
		store.Status = meta.StoreStatusRead
	}
	if err = p.zk.SetVolumeState(volume); err != nil {
		log.Errorf("zk.SetVolumeState() error(%v)", err)
	}
}
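The logged offset, 4294966850, is very close to the largest value a 32-bit unsigned integer can hold (4294967295). If block offsets are stored as uint32 counts of fixed-size padding units, this block is simply almost full, which would match the `volume.Block.Full()` branch in the loop above. Both the uint32 representation and the 8-byte unit below are assumptions that should be checked against needle.PaddingSize and the block code in the source tree; `headroom` is a hypothetical helper.

```go
package main

import "math"

// paddingSize is an ASSUMED alignment unit in bytes. Verify it against
// needle.PaddingSize in the bfs source before relying on the byte figure.
const paddingSize = 8

// headroom returns how many padding units, and how many bytes, remain
// before a uint32 offset would reach its maximum value.
func headroom(offset uint32) (units uint32, bytes uint64) {
	units = math.MaxUint32 - offset
	return units, uint64(units) * paddingSize
}
```

Under these assumptions, `headroom(4294966850)` yields 445 units, or 3560 bytes of remaining space. Note also that, as the loop is written, a single full volume is enough to set the whole store to meta.StoreStatusRead.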
NewCofnig("./store.toml") error(Type mismatch for 'conf.Config.Limit' : Type mismatch for 'conf.Limit.Delete' : Type mismatch for 'conf.Rate.Rate' : Expected float but found 'int64'.)
It looks like config.go was updated but the original store.toml was not. Does it need to be updated as well?
// InitBucket
// todo get data from db
func InitBucket() (b map[string]Bucket, err error) {
	var (
		item Bucket
	)
	b = make(map[string]Bucket)
	// bucket test
	item.BucketName = "test"
	item.Property = 0
	item.KeyId = "121bce6492eba701"
	item.KeySecret = "1eb80603e85842542f9736eb13b7e1"
	b["test"] = item
	return
}
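Hello, I'd like to understand: when the super block file is opened for writing needles, why isn't O_APPEND used? Is it because the store is a single process, so there are no concurrent writes? Thanks.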
func NewSuperBlock(file string, c *conf.Config) (b *SuperBlock, err error) {
	b = &SuperBlock{}
	b.conf = c
	b.File = file
	b.closed = false
	b.write = 0
	b.syncOffset = 0
	b.Padding = needle.PaddingSize
	// Separate descriptors are used for writing and reading.
	if b.w, err = os.OpenFile(file, os.O_WRONLY|os.O_CREATE|myos.O_NOATIME, 0664); err != nil {
		log.Errorf("os.OpenFile(\"%s\") error(%v)", file, err)
		b.Close()
		return nil, err
	}
	if b.r, err = os.OpenFile(file, os.O_RDONLY|myos.O_NOATIME, 0664); err != nil {
		log.Errorf("os.OpenFile(\"%s\") error(%v)", file, err)
		b.Close()
		return nil, err
	}
	if err = b.init(); err != nil {
		log.Errorf("block: %s init() error(%v)", file, err)
		b.Close()
		return nil, err
	}
	return
}