Data repo support for purging unreferenced indexes and objects #8054

Closed
Tracked by #8055
88250 opened this issue Apr 20, 2023 · 4 comments

Comments

@88250
Member

88250 commented Apr 20, 2023

All referenced data snapshots are recorded under the workspace /repo/refs/ folder.

A snapshot index in repo/indexes that is not referenced needs to be purged, and the data blocks belonging to that unreferenced snapshot are deleted from disk to free up storage space.

Settings - About - Data repo purge

@88250 88250 self-assigned this Apr 20, 2023
@88250 88250 added this to the 2.8.6 milestone Apr 20, 2023
@88250 88250 changed the title Support for cleaning up unreferenced data snapshots Data repo support for cleaning up unreferenced indexes and objects Apr 20, 2023
@88250 88250 changed the title Data repo support for cleaning up unreferenced indexes and objects Data repo support for purging unreferenced indexes and objects Apr 20, 2023
@88250 88250 closed this as completed Apr 20, 2023
@zxhd863943427
Contributor

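What I really want to ask is: under what circumstances is a data snapshot recorded under the workspace /repo/refs/ folder, and what is the criterion for that?
The current explanation is unclear.

If possible, could you also explain whether purging unindexed objects guarantees that the visible data snapshots can still be fully restored? These objects all come from creating data snapshots. If snapshots need to stay usable at all times and consistent across devices, deleting objects could invalidate them.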

@88250
Member Author

88250 commented Apr 21, 2023

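After a data snapshot is created manually, or created automatically by sync, refs/latest points to that snapshot, so there is always at least one reference.

The entries under refs/tags/ are the references created when a snapshot is manually tagged.

The data repo purge deletes snapshots that nothing points to, together with the data objects associated with them. For example, if there are 100 snapshots before the purge (100 files under the indexes folder) and only the 1 latest snapshot remains afterward, then only the data objects associated with that 1 snapshot are kept; all other data objects are deleted along with the other snapshots.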
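The purge described above can be read as a mark-and-sweep pass: collect the snapshots that refs/latest and refs/tags/ point to, keep the objects those snapshots list, and delete everything else. Below is a minimal sketch of that idea, not the actual implementation. The `Repo` class, its in-memory fields, and the `purge` function are hypothetical stand-ins for the on-disk refs, indexes, and objects folders, and it assumes an object may be listed by more than one snapshot.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    """Hypothetical in-memory stand-in for the data repo folders."""
    refs: dict = field(default_factory=dict)     # ref name -> snapshot ID
    indexes: dict = field(default_factory=dict)  # snapshot ID -> set of object IDs
    objects: set = field(default_factory=set)    # IDs of stored data objects


def purge(repo: Repo) -> tuple:
    """Delete unreferenced snapshots and objects.

    Returns (number of snapshots removed, number of objects removed).
    """
    # Mark: a snapshot is live if any ref points to it.
    live_snapshots = {sid for sid in repo.refs.values() if sid in repo.indexes}

    # Mark: an object is live if any live snapshot lists it.
    live_objects = set()
    for sid in live_snapshots:
        live_objects |= repo.indexes[sid]

    # Sweep: everything that was not marked is removed.
    dead_snapshots = set(repo.indexes) - live_snapshots
    dead_objects = repo.objects - live_objects
    for sid in dead_snapshots:
        del repo.indexes[sid]
    repo.objects -= dead_objects
    return len(dead_snapshots), len(dead_objects)


if __name__ == "__main__":
    repo = Repo()
    # 100 snapshots; each lists one object shared by all and one of its own.
    for i in range(100):
        repo.indexes[f"s{i}"] = {"shared", f"o{i}"}
        repo.objects |= {"shared", f"o{i}"}
    repo.refs["latest"] = "s99"  # only the latest snapshot is referenced

    print(purge(repo))           # counts of removed snapshots and objects
    print(sorted(repo.indexes))  # remaining snapshot IDs
```

In this sketch an object is removed only when no remaining snapshot lists it, so every snapshot that is still referenced keeps all of its objects and stays restorable.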

@zxhd863943427
Contributor

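So "referenced" means the tagged snapshots plus the snapshot most recently created automatically by sync? That seems fine to me.

That said, I think the purge would be better if its scope could be limited. Wiping everything in one go is a bit too brute-force.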

@88250
Member Author

88250 commented Apr 21, 2023

Let's just purge everything in one go.
