Study of cwfs
table of contents

    1.0.0 What is cwfs
    2.0.0 cwfs console
    2.1.0 Overview
    2.2.0 backup of fsworm
    3.0.0 Structure of fsworm
    3.1.0 Block
    3.2.0 dump stack
    3.3.0 Super Block
    3.4.0 Directory Entry Block
    3.5.0 Indirect Block
    3.6.0 File Block
    3.7.0 Fsworm Root
    3.8.0 Dump order
    3.9.0 qid
    4.0.0 Structure of fscache
    4.1.0 Config Block
    4.2.0 Tcache Block
    4.3.0 Mapping
    4.3.1 bucket
    4.3.2 cache entry
    4.3.3 age
    4.3.4 state
    4.4.0 free block and dirty block
    4.5.0 msize and csize
    4.6.0 Map from Worm to Cache
    4.7.0 msize determination algorithm
    4.8.0 Fscache Root
    5.0.0 Recovery
    5.1.0 recovery
    5.2.0 Review on recovery
    5.3.0 What is lost by Recovery
    6.0.0 Other Configurations
    6.1.0 pseudo-RAID 1
    6.2.0 fake WORM
    6.2.1 Creating fake WORM
    6.3.0 Tvirgo
    7.0.0 Misc.
    7.1.0 What did I do that day?
    7.2.0 atime
    7.2.1 Plan 9
    7.2.2 Linux
    7.2.3 OSX
    8.0.0 cwstudy
    8.1.0 usage
    References 

2012/10/20
Renewed 2013/03/14
Updated 2013/03/20
Updated 2013/04/02
Updated 2013/04/10
Updated 2013/04/24

In the summer vacation of 2012 I tried out 9front's cwfs. cwfs consists of fscache and fsworm; I decided to study it starting from fsworm. The reason fsworm is the more interesting of the two is that information such as file history is kept in fsworm, and recovery after an fscache crash is done based on fsworm.
Of course, any device eventually dies, so a backup of fsworm is necessary. Putting cwfs itself on RAID may be possible, but that is overkill for me. I want to add another HDD and keep a copy of fsworm there. So, how can it be copied in a short time? That is the problem I wanted to solve.

The cwfs that comes with 9front is cwfs64x, so the explanation below is based on cwfs64x.

Note: This article is just my memo so far. Of course it may contain errors. I would be pleased if you let me know of any errors you find.

What is cwfs

This section is incomplete. I will write it in my spare time.

Figure 1. layer

Sean Quinlan (Ph.D)
Ken Thompson (Ken's fs)
Plan 9 2nd edition (1995)
Geoff Collyer (2007/03/27 mail)
9front (2011)
Past History
WORM (write once read many)
It will not disappear (can not be erased)
dump

Comparison with Mac/OSX Time Machine
pdumpfs

Previously, a dedicated machine was assigned to the file server:
memory: memory in the dedicated file server
disk: disk cache
WORM: optical disc

Currently it runs as the Plan 9 user program cwfs.
Advantage of being a user program: the mechanism is easy to examine.
memory: memory of the cwfs daemon
disk: fscache partition of the file server
WORM: fsworm partition of the file server

cwfs (or cwfs32)
cwfs64
cwfs64x

cwfs console

Overview

The sizes of my cwfs partitions are as follows.

  term% ls -l /dev/sdC0/fs*
  --rw-r----- S 0 arisawa arisawa 22848146432 Sep 13 09:50 /dev/sdC0/fscache
  --rw-r----- S 0 arisawa arisawa 114240734208 Sep 13 09:50 /dev/sdC0/fsworm
  term%

The cwfs console can be used with

  con -C /srv/cwfs.cmd

Alternatively, create a new window and run

  cat /srv/cwfs.cmd

there, then execute commands from another window, for example

  echo statw >> /srv/cwfs.cmd

The distribution seems to assume this latter method.

Running statw on the cwfs console outputs usage statistics of fsworm. An example of the output follows.

  > statw
 cwstats main
	 filesys main
		 maddr = 3
		 msize = 10313
		 caddr = 1035
		 csize = 1392255
		 sbaddr = 1755217
		 craddr = 1755389 1755389
		 roaddr = 1755392 1755392
		 fsize = 1755394 1755394 0 + 25%
		 slast = 1754642
		 snext = 1755393
		 wmax = 1755392 0 + 25%
		 wsize = 6972701 1 + 0%
		   8600 none
		 230005 dirty
		      0 dump
		 1153555 read
		     95 write
		      0 dump1
		 cache 17% full
 >

Figure 2. Example of statw output from the cwfs console.
The prompt " > " is not issued by the official distribution version; it comes from a version modified by the author.

The numbers to the right of the equal signs (maddr through wsize; first and, where present, second column) represent sizes. The numbers in the first column are obtained from the Tcache block of fscache (block address 2). The second column is obtained from fsworm (Note 1). The units are blocks, except for msize, which is counted in buckets.

Note 1: This part is incorrect. The second column is also obtained from fscache: fscache holds a cache of the last super block of fsworm, and that is what is displayed. For fsize, the second column matches the information obtained from the last super block, while the first column is the value at the time statw is executed, consistent with the fsize in the Tcache block. (2013/04/24)

Notice that fsworm does not contain information on its own partition size. When fsworm fills up it may be replaced by a larger partition, and in that case the design is presumably that the contents of the full partition are simply copied to the new one.

Both fscache and fsworm are managed in block units (Note 2). In the case of cwfs64x, 1 block is 16 KB. Hereinafter a block is referred to by its address: ba[0] means the first block and ba[1] the second.

wsize (6972701) is the size of /dev/sdC0/fsworm in blocks.

  6972701 * 16 * 1024 == 114240733184

This is 1024 B less than the actual size of /dev/sdC0/fsworm; since the block is the unit of use, a small unusable area results.

Note that fsworm has no concept of formatting: it is write-once, so if it were formatted, data could no longer be written to it (Note 3)!

fsworm is written from the top block in order. snext (next super block) indicates the next block to be written. On each dump, a block called the super block is written first, followed by the body of the data. Super blocks serve as the boundaries between dumps and play a fundamental role in fsworm.

sbaddr (super block address) is the address of the last written super block. slast is the address of the super block before that.

Note 2: The blocks managed by cwfs are different from the blocks that are the read/write units managed by the driver.
Note 3: WORM in its original meaning is rarely used these days. Optical discs have lost their significance as backup media; substituting a hard disk for the WORM is low-cost and easy to use. The following explanation also assumes the use of a hard disk.

Backup of fsworm
Revised 2013/03/06

Although fsworm is (by its mechanism) very robust, there is still a risk of data loss from a hardware crash. For backup, you want a copy of fsworm (e.g. fsworm-bak).

You might think that for each dump it suffices to copy only the newly created blocks of fsworm, but it is not that easy, because unwritten blocks exist below the last dumped super block. In my observation there are two kinds:
(a) reserved blocks (Note 1)
(b) free blocks
(a) is a block of fsworm that is in use in fscache but has not yet been dumped.
(b) is a block of fsworm that fscache can use in the future.

Note 1: "reserved block" is a term that I am using arbitrarily. This term does not exist in the literature.

Structure of fsworm

Block

As already noted, both fscache and fsworm are managed in units of blocks; in the case of cwfs64x, 1 block is 16 KB, and ba[n] denotes the block at address n. In both partitions the first two blocks are special. A block address is specified by 8 B of data; in the program this data type is called Off.

In the case of fscache, ba[0] contains the config information; ba[1] seems unused.
In the case of fsworm, there is no sign that the first 2 blocks are used.

Each block has a Tag. However, not all blocks are formatted (Note 1).

The size of a Tag is 12 B in the case of cwfs64x, and it is placed at the end of the block. Its structure is as follows.

  struct Tag
  {
	short	pad;		/* make tag end at a long boundary */
	short	tag;
	Off	path;
  };

The value of pad is 0. tag indicates the type of the block. The interpretation of path differs from tag to tag.

The area excluding the 12 B Tag, i.e. 16 KB minus 12 B, is the data area (hereinafter Data), whose structure depends on the type of block. To summarize:

  block = Data + Tag

Note 1: Recent hard disks have large capacities, and formatting all the blocks would take a huge amount of time. Besides, a WORM device cannot be written to once formatted!
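To make the layout concrete, here is a minimal sketch (standard C, not the cwfs source) that decodes the 12 B Tag at the end of a cwfs64x block image. The packed field layout and little-endian byte order are assumptions that may need adjusting:

  /* sketch: decode the 12 B Tag at the end of a 16 KB cwfs64x block
   * image.  Assumption: fields are laid out pad(2) tag(2) path(8),
   * little-endian; adjust for the byte order of your image. */
  #include <stdio.h>
  #include <stdint.h>

  enum {
	BLOCKSIZE = 16*1024,		/* cwfs64x block size */
	TAGSIZE   = 12,
	DATASIZE  = BLOCKSIZE - TAGSIZE	/* 16372 B of Data */
  };

  static uint64_t
  le(unsigned char *p, int n)	/* little-endian decode of n bytes */
  {
	uint64_t v = 0;
	while(n-- > 0)
		v = v<<8 | p[n];
	return v;
  }

  void
  printtag(unsigned char *block)
  {
	unsigned char *t = block + DATASIZE;

	printf("pad: %llu tag: %llu path: %llu\n",
		(unsigned long long)le(t, 2),
		(unsigned long long)le(t+2, 2),
		(unsigned long long)le(t+4, 8));
  }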

dump stack
2013/02/28

As dumps are repeated, the dumped data piles up on fsworm. Data is never overwritten. The situation is shown in Figure 3 (left).

Figure 3. dump

Hereinafter the block at block address n is denoted ba[n]. ba[0] and ba[1] are not used. When cwfs first runs, three blocks

  ba[2]: super block
  ba[3]: cfs root block
  ba[4]: dump root block

are created. They do not contain information on the files in fscache.
Note: The words "cfs root block" and "dump root block" do not appear in the official documents or the program code; "cfs root address" and "dump root address" do.

Blocks are stacked with each dump. We call the set of blocks stacked by one dump simply a dump. One dump always includes three blocks: the super block, the cfs root block, and the dump root block. Figure 3 (right) shows the internal structure of a dump.

In the blocks between the super block and the cfs root block, the contents of fscache are saved (only the differences since the previous dump).

The blocks between the cfs root block and the dump root block contain information on the dump dates. The number of blocks required depends on the number of dumps.
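Schematically, one dump then looks like this (my reading of Figure 3, right; the labels s, c, r are mine):

  ba[s]                super block
  ba[s+1] .. ba[c-1]   contents of fscache (blocks changed since the previous dump)
  ba[c]                cfs root block
  ba[c+1] .. ba[r-1]   dump date information (/YYYY/MMDD)
  ba[r]                dump root block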

Super Block

The most basic thing for grasping the structure of fsworm is the super block. The tag of a super block is Tsuper (= 1). Reading the cwfs code, the path in the Tag of a super block is QPSUPER (= 2).

One super block is created each time fscache is dumped. First the super block is written to fsworm, then the body of the data is written block by block (Figure 3, right).

The super block has the structure

  struct Superb
  {
	Fbuf	fbuf;
	Super1;
  };

Fbuf contains an array of free block addresses. I will postpone commentary on free blocks.

Super1 is important.

  struct Super1
  {
	Off	fstart;
	Off	fsize;
	Off	tfree;
	Off	qidgen;		/* generator for unique ids */
	/*
	 * Stuff for WWC device
	 */
	Off	cwraddr;	/* cfs root addr */
	Off	roraddr;	/* dump root addr */
	Off	last;		/* last super block addr */
	Off	next;		/* next super block addr */
  };

Figure 4. Structure of Super1

As you can see, each super block holds the address (next) of the next super block, and the first super block starts at ba[2]. Therefore, by following the super blocks in order from ba[2], you can find the address where the next dump will go. An example of the output follows. (The tool is introduced later.)

  super blocks:
 2
 5
 69908
 85793
 104695
 222009
 ...
 1751346
 1754278
 1754381
 1754569
 1754642
 1755217
 1755393

The last address, 1755393, is the super block to be created next. The contents of Super1 in ba[1755217] are (for example) as follows.

  super1 fstart: 2
 super1 fsize: 1755394
 super1 tfree: 92
 super1 qidgen: 6d76e
 super1 cwraddr: 1755389
 super1 roraddr: 1755392
 super1 last: 1754642
 super1 next: 1755393

Some of this information is also available on the cwfs console:

sbaddr 1755217: the current super block (the last written super block)
snext 1755393: the next dump address (that is, the super block address to be created next)
slast 1754642: the super block address one before sbaddr
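Since each super block records next, the list above can be produced mechanically. A minimal sketch (standard C, not the cwfs source; istsuper() and supernext() are hypothetical helpers):

  /* sketch: list super block addresses by following Super1.next from
   * ba[2].  istsuper(fd, ba) and supernext(fd, ba) are hypothetical
   * helpers: istsuper checks the Tag at the end of block ba for
   * Tsuper, supernext extracts the Super1.next field of that block. */
  #include <stdio.h>

  typedef long long Off;

  extern int istsuper(int fd, Off ba);	/* hypothetical */
  extern Off supernext(int fd, Off ba);	/* hypothetical */

  void
  walksupers(int fd)
  {
	Off ba;

	for(ba = 2; istsuper(fd, ba); ba = supernext(fd, ba))
		printf("%lld\n", ba);
	printf("%lld (snext: not yet written)\n", ba);
  }

The loop ends exactly at snext, because that block has not been written yet and so carries no Tsuper tag.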

Directory Entry Block
Updated 2013/03/02

Next is the directory entry block. The Tag.tag of this block is Tdir, and its Tag.path matches that of the parent directory.

A directory entry block contains one or more of the following directory entries (Dentry); up to 62 in the case of cwfs64x.

  struct Dentry
  {
	char	name[NAMELEN];
	Userid	uid;
	Userid	gid;
	ushort	mode;
		#define	DALLOC	0x8000
		#define	DDIR	0x4000
		#define	DAPND	0x2000
		#define	DLOCK	0x1000
		#define	DTMP	0x0800
		#define	DREAD	0x4
		#define	DWRITE	0x2
		#define	DEXEC	0x1
	Userid	muid;
	Qid9p1	qid;
	Off	size;
	Off	dblock[NDBLOCK];
	Off	iblocks[NIBLOCK];
	long	atime;
	long	mtime;
  };

Figure 5. Directory entry

Since file and directory names are stored here, the size of a Dentry depends on the allowed name length (NAMELEN - 1).

The dump root block is one of the directory entry blocks. On my home system, the dump dates can be seen as follows.

  term% ls /n/dump
  /n/dump/2012/0801
  /n/dump/2012/0802
  /n/dump/2012/0804
  /n/dump/2012/0813
  ....
  /n/dump/2013/0121
  /n/dump/2013/0127
  /n/dump/2013/0128
  /n/dump/2013/0205
  ....

The first line was generated by the first dump (August 1, 2012). With

  ls -l /n/dump/2012/0801

you can access the files of that day, as follows.

  maia% ls -l /n/dump/2012/0801
  d-rwxrwxr-x M 495 sys sys 0 Jul 31 2012 /n/dump/2012/0801/386
  d-rwxrwxr-x M 495 sys sys 0 Jul 31 2012 /n/dump/2012/0801/68000
  d-rwxrwxr-x M 495 sys sys 0 Jul 31 2012 /n/dump/2012/0801/68020
  d-rwxrwxr-x M 495 sys sys 0 Jul 31 2012 /n/dump/2012/0801/acme
  d-rwxrwxr-x M 495 adm adm 0 Jul 31 2012 /n/dump/2012/0801/adm
  ....
  d-rwxrwxr-x M 495 sys sys 0 Jan 18 2012 /n/dump/2012/0801/mnt
  d-rwxrwxr-x M 495 sys sys 0 Jan 18 2012 /n/dump/2012/0801/n
  d-rwxrwxr-x M 495 sys sys 0 Jul 31 2012 /n/dump/2012/0801/power
  d-rwxrwxr-x M 495 sys sys 0 Jul 31 2012 /n/dump/2012/0801/power64
  d-rwxrwxr-x M 495 sys sys 0 Jul 31 2012 /n/dump/2012/0801/rc
  d-rwxrwxr-x M 495 sys sys 0 Jul 31 2012 /n/dump/2012/0801/sparc
  d-rwxrwxr-x M 495 sys sys 0 Jul 31 2012 /n/dump/2012/0801/sparc64
  d-rwxrwxr-x M 495 sys sys 0 Jul 31 2012 /n/dump/2012/0801/sys
  dr-xr-xr-x M 495 sys sys 0 Jan 18 2012 /n/dump/2012/0801/tmp
  d-rwxrwxr-x M 495 sys sys 0 Aug 1 2012 /n/dump/2012/0801/usr
  maia%

Figure 6. ls /n/dump

The dump date information (YYYY/MMDD) lies between the dump root address and the cfs root address. (Figure 3)

In this case, blocks are connected as shown in Figure 7.

Figure 7. Connection of directory entry block

A rectangle represents one block. In this figure every block is a directory entry block. Since the number of dumps is still small, the date information for 2013 fits in a single directory entry block, but in time multiple blocks will be required.

In Plan 9, as you can see from the Dentry structure in Figure 5, the name of a directory and information such as mode live in the same block. On UNIX, on the other hand, information such as mode is placed in the inode, a block separate from the list of names (Figure 8). The origin of this difference seems to be that UNIX, in order to support hard links, needs the link counter in a block (namely the inode) separate from the name.

Figure 8. Conceptual diagram of the unix inode. "contents" means the contents of a file or of another directory.

The structure of Qid9p1 is:

  struct Qid9p1
  {
	Off	path;		/* was long */
	ulong	version;	/* should be Off */
  };

Note that the first bit of this path is set to 1 when the entry is a directory (that is, when mode&DDIR != 0). (I am not sure why it was designed this way.) The official qid, i.e. the qid displayed by the command

  ls -ql

is this qid.path with that bit dropped, that is, qid.path&~DDIR.

In the case of cwfs64x, one Dentry is 260 B, so one block holds up to 62 Dentry.

name holds the file or directory name. NAMELEN is 144 in the case of cwfs64x; since the name is terminated by '\0', the maximum name length is 143 characters. Besides the name, the basic information of the directory or file is kept here.

If a Dentry represents a file (mode&DDIR == 0), the file contents can be traced via the direct blocks (dblock[NDBLOCK]) and the indirect blocks (iblocks[NIBLOCK]). Blocks containing file contents are tagged Tfile.

If a Dentry represents a directory (mode&DDIR != 0), the blocks in which the directory contents (information on the files and directories inside it) are placed are set in the direct blocks (dblock[NDBLOCK]) and the indirect blocks (iblocks[NIBLOCK]). Blocks containing directory contents are tagged Tdir.

In the case of cwfs64x, NDBLOCK is 6 and NIBLOCK is 4.

Data is written directly into the 6 direct blocks. Since one direct block holds 16*1024 - 12 B of data, the 6 direct blocks hold a total of 6*(16*1024 - 12) B. For a directory, up to 372 (= 62*6) Dentry can be handled.

Indirect Block

The block indicated by iblocks[0] of a directory entry holds

  (16*1024 - 12)/8 = 2046

block addresses. (Here 8 is the size of one block address.) These block addresses point to the locations of the data (i.e. direct blocks). Therefore

  2046 * (16*1024 - 12) = 33497112 B

can be written. Let us call such a block a primary indirect block.

iblocks[1] points to a secondary indirect block. Block addresses are written in this block too, but they are the addresses of primary indirect blocks (not the locations of data). Therefore

  2046 * 2046 * (16*1024 - 12) = 68535091152 B

can be written.

Likewise, iblocks[2] gives

  2046 * 2046 * 2046 * (16*1024 - 12) = 140222796496992 B

and iblocks[3] gives

  2046 * 2046 * 2046 * 2046 * (16*1024 - 12) = 286895841632845632 B

The tags of the indirect blocks are:

  iblocks[0]	Tind1
  iblocks[1]	Tind2
  iblocks[2]	Tind3
  iblocks[3]	Tind4

Each of their Tag.paths matches the qid of the parent directory.
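These capacities follow mechanically from the block geometry. A small standard-C sketch that reproduces the numbers above:

  /* sketch: reproduce the cwfs64x file-size limits quoted above */
  #include <stdio.h>

  int
  main(void)
  {
	long long data = 16*1024 - 12;	/* usable bytes per block */
	long long perind = data / 8;	/* addresses per indirect block: 2046 */
	long long n = perind;
	int i;

	printf("direct: %lld B\n", 6 * data);	/* 6 direct blocks */
	for(i = 1; i <= 4; i++){		/* Tind1 .. Tind4 */
		printf("iblocks[%d]: %lld B\n", i-1, n * data);
		n *= perind;
	}
	return 0;
  }

Running it prints 33497112, 68535091152, 140222796496992, and 286895841632845632 for iblocks[0] through iblocks[3], matching the figures above.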

File Block

Blocks containing the contents of a file are tagged Tfile. The Tag.path of such a block matches the qid of the directory to which the file belongs.

One file block holds at most

  16*1024 - 12 B

of file data.

The contents of a file are managed in block units. Are all the blocks rewritten when the contents of a file are updated? According to my experiments, no. If you append data to the end of a file larger than one block, only the last block is updated; the other blocks are used as they are. However, the experiment was done under one of the following conditions:
(a) the append attribute is set on the file
(b) writing at the end of the file after seeking to the end

If I opened a file larger than 16 KB with a text editor and added data to the end, I suspect it would be rewritten entirely. (I have not done that experiment, though...)

The property of cwfs that only rewritten blocks are newly generated is especially important for servers. On my server, the log file of the web server is 1.7 GB:

  1757143424 Oct 18 17:13 http

Copying a file of this size every day would be unsustainable (Note 1).

Note 1: On Mac/OSX's Time Machine and Linux's pdumpfs, if a file has been updated, a new copy of the whole file is made. If you try to realize Time Machine's functionality with hard links alone, copying is an unavoidable consequence of the design. Since servers carry large log files that are updated day by day, that approach should be quite unsuitable for server use. As for database files, they should be excluded from the daily dump even with cwfs; logging transactions is safer.

Fsworm Root
2013/03/09

All information in fsworm can be traced from the dump root block at the top of the dump stack. Its address is shown as roaddr on the cwfs console. Figures 6 and 7 show the first part of the paths seen from here.

Strictly speaking, roaddr is a root block managed by fscache, but it coincides with the dump root block of fsworm.

Dump order

The dump is performed based on the current fscache. First the super block is written to fsworm. Following it, the information at the tips of the directory tree is written first. Thus in fsworm, when for example

  /2012/0925/....

is created, the / comes last, 2012 just before it, 0925 before that, and so on.

qid

From the user's side, you can see the qid of a file or directory by adding the q option to the ls command. For example:

  maia% ls -ql
  (000000000009baa2 6 00) --rw-rw-r-- M 326 web web 33597 Mar 8 15:01 bucket.png
  (000000000009baa3 3 00) --rw-rw-r-- M 326 web web 13693 Mar 8 15:02 bucket.svg
  (0000000000089b8c 2 00) --rw-rw-r-- M 326 arisawa web 782 Sep 28 10:11 console.txt
  (0000000000089b8d 2 00) --rw-rw-r-- M 326 arisawa web 2401 Oct 15 21:21 cwfs.svg
  ...
  maia%

The hexadecimal part in parentheses at the head is the qid, and the digit after it is the qid version.
According to the manual, the qid is unique within the file system.

If it is to be unique, qids must be managed somewhere. qidgen in the super block seems to exist for that purpose (Figure 4).

As you can tell by experiment, the qid does not change when a file is renamed; the version changes when the contents change.
So, when writing an editor, you could use the qid to find out at save time whether the file has been changed by something else, but the time stamp is easier, so I have never used the qid for that. (The qid of unix seems to be different.)

Note that in fsworm and fscache, the qid and its version are included in the directory entry (Figure 5), and the blocks holding the contents carry the same qid; that is, the qid seems to be used to confirm which blocks belong together.

Structure of fscache

  ba[0]	config
  ba[1]	-
  ba[2]	Tcache

  ba[maddr]	map
  ...
  ba[caddr]	cache
  ...

Config Block
2013/03/05

As far as cwfs64x is concerned, the following data was written in text format at the beginning of my config block (ba[0]). (This content can also be seen with the printconf command of the cwfs console.)

  service cwfs
  filsys main c(/dev/sdC0/fscache)(/dev/sdC0/fsworm)
  filsys dump o
  filsys other (/dev/sdC0/other)
  noauth
  newcache
  blocksize 16384
  daddrbits 64
  indirblks 4
  dirblks 6
  namelen 144

noauth means allowing access to cwfs without authentication. Note that noauth is a special setting that should be allowed only experimentally, in a secure environment.
The version I use at the university is this year's February version, which does not use noauth. (2013/04/10)

In addition, this block is tagged as follows:

  pad:  0000
  tag:  10 (Tconfig)
  path: 0

The source code has the following structure:

  struct Conf
  {
	ulong	nmach;		/* processors */
	ulong	nuid;		/* distinct uids */
	ulong	nserve;		/* server processes */
	ulong	nfile;		/* number of fid - system wide */
	ulong	nwpath;		/* number of active paths, derived from nfile */
	ulong	gidspace;	/* space for gid names - derived from nuid */

	ulong	nlgmsg;		/* number of large message buffers */
	ulong	nsmmsg;		/* number of small message buffers */

	Off	recovcw;	/* recover addresses */
	Off	recovro;
	Off	firstsb;
	Off	recovsb;

	ulong	configfirst;	/* configure before starting normal operation */
	char	*confdev;
	char	*devmap;	/* name of config->file device mapping file */

	uchar	nodump;		/* no periodic dumps */
	uchar	dumpreread;	/* read and compare in dump copy */
	uchar	newcache;
  };

The data in it are filled in (in the source code) for each type of cwfs during initialization.

Tcache Block

The Tcache block manages basic information about cwfs. You can view it on the cwfs console.

  struct Cache
  {
	Off	maddr;		/* cache map addr */
	Off	msize;		/* cache map size in buckets */
	Off	caddr;		/* cache addr */
	Off	csize;		/* cache size */
	Off	fsize;		/* current size of worm */
	Off	wsize;		/* max size of the worm */
	Off	wmax;		/* highwater write */

	Off	sbaddr;		/* super block addr */
	Off	cwraddr;	/* cw root addr */
	Off	roraddr;	/* dump root addr */

	Timet	toytime;	/* somewhere convienent */
	Timet	time;
  };

Each block of fscache is cached in memory. Every ten seconds the memory cache is written back to fscache (if there are updates).

Mapping
Updated 2013/03/08

Each block of fsworm is mapped to a cache block in the cache area of Figure 9.

Let cba be a cache block address. Then

  caddr <= cba < caddr + csize

holds. Blocks 0 through wsize of fsworm are mapped into this area of fscache.

The total number of cache blocks is given by csize (on the cwfs console); not all of the blocks from caddr onward are cache blocks.

In the example of my system, fsworm has 6972701 blocks while fscache has 1392255 cache blocks, i.e. about 1/5 of the capacity. Also, 1394540 blocks fit in fscache, but only 1035 + 1392255 (= 1393290) are actually used; the unused area is about 0.1%.

Thinking about it naively, one might come up with a mapping like Table 1.

Table 1. a simple but problematic mapping from fsworm to cache

Here "cache" means the cache area of fscache; the addresses shown are counted from caddr.

However, this mapping has a problem: while one fsworm block occupies a cache block, other fsworm blocks that map to the same cache block cannot enter the cache.

So cwfs interposes buckets, giving the mapping flexibility. The situation is shown in Table 2.

Table 2. real mapping implementation of cwfs.

For ease of explanation, let wa be a block address of fsworm and ca a cache address counted from caddr. A worm block wa can be placed in any cache block ca that satisfies ca % msize == wa % msize. The actual state of the mapping is managed by the buckets in the map blocks of fscache.

bucket

The blocks from ba[maddr] up to ba[caddr] are map blocks; their tag is Tbuck. A map block is a collection of buckets, and a bucket holds the state of the cache and the block addresses of the corresponding fsworm blocks.

Each bucket has the following structure.

  struct Bucket
 {
	 long agegen; / * generator for ages in this bkt * /
	 Centry entry [CEPERBK];
 };

Each map block holds up to BKPERBLK (= 10) buckets.

Figure 9. Structure of fscache (cwfs64x)

The total number of buckets is given by msize. In the following, buckets are numbered from 0 to msize - 1 in ascending order of block address; this number is called the bucket address.

cache entry

Each bucket has CEPERBK (= 135) cache entries. Each cache entry represents the state of one current mapping.

  struct Centry
 {
	 ushort age;
	 short state;
	 Off waddr; / * worm addr * /
 };

To find the corresponding fscache block address cba from an fsworm block address ba, first compute

  bn = ba % msize

Then check the waddr of the 135 cache entries in bucket[bn]. If the block at fsworm address ba is cached, some entry's waddr should match ba. Letting that entry be entry[ce],

  cba = msize * ce + bn + caddr

Details are explained in "Map from Worm to Cache".

Conversely, to obtain the fsworm block address ba from the block address cba of a currently cached fscache block, first compute

  bn = (cba - caddr) % msize	# bucket addr
  ce = (cba - caddr) / msize	# entry addr

and then look up the waddr of cache entry ce in bucket bn.

A block in the cache area of fscache has a "state", given by its cache entry. In the following we will speak of "the state of the cache block" or "the worm address corresponding to the cache block".
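To make the two lookups concrete, here is a minimal sketch (standard C, not the cwfs source) implementing both directions of the mapping arithmetic just described:

  /* sketch of the fsworm<->fscache mapping arithmetic described
   * above (not the cwfs source).  bucket[] is assumed to have been
   * read out of the map blocks ba[maddr] .. ba[caddr-1]. */
  enum { CEPERBK = 135 };	/* cache entries per bucket (cwfs64x) */

  typedef long long Off;

  typedef struct {
	unsigned short age;
	short state;			/* Cnone, Cdirty, Cread, ... */
	Off waddr;			/* worm addr */
  } Centry;

  typedef struct {
	long agegen;
	Centry entry[CEPERBK];
  } Bucket;

  /* fsworm block ba -> fscache block address; -1 if not cached */
  Off
  wormtocache(Bucket *bucket, Off msize, Off caddr, Off ba)
  {
	Off bn = ba % msize;		/* bucket address */
	int ce;

	for(ce = 0; ce < CEPERBK; ce++)
		if(bucket[bn].entry[ce].waddr == ba)	/* state test omitted */
			return msize*ce + bn + caddr;
	return -1;
  }

  /* fscache block cba (in the cache area) -> fsworm block address */
  Off
  cachetoworm(Bucket *bucket, Off msize, Off caddr, Off cba)
  {
	Off bn = (cba - caddr) % msize;	/* bucket address */
	Off ce = (cba - caddr) / msize;	/* entry address */

	return bucket[bn].entry[ce].waddr;
  }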

age

age represents the age of the cache block; smaller values are older, so conceptually it is closer to a birthday than an age. When a new cache slot is needed (and no Cnone entry exists), the cache blocks with small age are discarded first. The bucket's agegen value becomes the age of a newly allocated entry, and agegen increases by 1 each time an entry is allocated. A maximum MAXAGE (= 10000) exists; when it is exceeded, ages must be reassigned, but I have not read the code closely enough to understand the details.

state

The cache state is the state of the corresponding Centry and takes the following values:

  Cnone = 0
  Cdirty = 1
  Cdump = 2
  Cread = 3
  Cwrite = 4
  Cdump1 = 5

The state of an unused cache block is Cnone. In my observation, waddr in this case seems to be garbage and has no meaning.

For a cache block whose data was read from the worm, waddr is the address of the original worm block, and the state is Cread. In this case the contents of the cache block are the same as those of the corresponding fsworm block. (Naturally, since it is a cache.)

When an existing directory or file is changed, the Cread state of its cache block changes to Cwrite, and waddr is left unchanged. Of course the dump must not overwrite the old block; at dump time the block is saved at a new worm address, and waddr in the cache entry is updated at the same time. Once dumped, the state becomes Cread.

When a new directory or file is created, the waddr corresponding to its cache block is assigned an unused worm address, and the state is Cdirty. When the dump completes, it becomes Cread.

Cache blocks in the states Cnone and Cread carry no changes that need to be reflected at waddr. Therefore the fscache block corresponding to such a Centry may be discarded (if necessary).

By the way, when you run the statw command on the cwfs console, the part

  8600 none
		 230005 dirty
		      0 dump
	        1153555 read
		     95 write
		      0 dump1

is produced by examining the Centry entries directly and displaying the distribution (counts) of the state values.

free block and dirty block
2013/03/11
Revised 2013/03/14
Corrected 2013/03/29
Added 2013/04/10

It is better to classify blocks as follows:
(a) unwritten block
(b) dirty block
(c) free block

The unwritten blocks of (a) are the not-yet-written blocks that exist in the dumped area.
The dirty blocks of (b) are blocks whose state in fscache is Cdirty. When cwfs creates a new file or directory, it associates a worm address with the cache block and sets the cache state to Cdirty. This worm address is an unused address; if an unwritten block is available, it is used.
Immediately after a dump,

  (a) ⊇ (b)

holds.
The free blocks of (c) are those dirty blocks that are registered as free blocks in fscache. They are not linked into the file tree. Linked dirty blocks are subject to dump, but blocks that have dropped out of the tree are not dumped. cwfs does not throw these away as garbage; it reuses them when it next creates a dirty block.
cwfs keeps a list of free blocks. The free block list exists in the super block and in fscache blocks tagged Tfree. A super block or a Tfree block can hold 2037 block address entries.
Therefore

  (b) ⊇ (c)

holds.

For a block in the Cwrite state in fscache, the dump destination address is fixed only when the block is actually dumped; these addresses are stacked on top of the last dump. A block in the Cdirty state, on the other hand, is provisionally associated with an fsworm address.

Couldn't this be arranged cleverly so that free blocks never arise in the first place? I have thought about it in various ways, and it seems quite difficult. Deletion of files and directories is the problem. Suppose the dump runs every morning at 5:00, and that during the day's work several new directories and files C1, C2, ..., Cn are created. All of their blocks in fscache are placed in the Cdirty state and associated with fsworm addresses. Some of them may be deleted the same day; call the deleted ones D1, D2, ..., Dm. What gets dumped is C1, ..., Cn minus D1, ..., Dm, but the fsworm addresses generated by the dump can hardly be expected to be contiguous, so "holes" appear. Those holes should be reserved as free blocks for future use.

Of course, a deleted file or directory is removed from its parent directory's entry immediately, but the state of its contents remains Cdirty.

There is still much I do not understand about the Cdirty problem. My fscache holds a large number of Cdirty blocks, and apparently this state is abnormal. What is the cause? What I can think of is that after deleting a large directory with the clri command on the cwfs console, I did not run the check free command. When a file is deleted with the rm command, the blocks that become unnecessary enter the free block list, but with clri they apparently do not; they seem to be thrown away as unused blocks.

The super block carries a free block list that can register 2037 free blocks. Free block addresses beyond that are recorded in fscache blocks tagged Tfree. A Tfree block carries a free block list just like the super block; the structure of the list is the same in both.

Let free[n] (n = 0, 1, ...) denote the free blocks contained in a super block or Tfree block. Following the source code, you find that free[0] is special: it is a pointer to a Tfree block. More precisely, the fscache block corresponding to free[0] should be a Tfree block. (Figure 10)

Figure 10. freelist chain
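Figure 10 can be summarized in a small sketch (C; Fbuf as I read it in the cwfs source, with 2037 entries per list block for cwfs64x):

  /* sketch: the free block list carried by a super block or Tfree
   * block.  free[0] chains to the next Tfree block -- more precisely,
   * the fscache block corresponding to free[0] is that Tfree block. */
  enum { FEPERBK = 2037 };	/* list entries per block (cwfs64x) */

  typedef long long Off;

  typedef struct Fbuf Fbuf;
  struct Fbuf
  {
	Off	nfree;			/* number of valid entries */
	Off	free[FEPERBK];		/* free[0]: chain pointer */
  };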

msize and csize

Within msize buckets there are msize*CEPERBK cache entries. In my case msize is 10313, so this comes to 1392255, which matches csize. In other words,

  csize = msize*CEPERBK

holds. One cache entry manages exactly one cache block.

To store msize buckets,

  (msize+BKPERBLK-1)/BKPERBLK

blocks are needed. (The division looks complicated because it rounds up.) In my case msize is 10313, so this evaluates to 1032; adding maddr (3) gives 1035, matching caddr. In other words,

  caddr = (msize+BKPERBLK-1)/BKPERBLK + maddr

holds.

Map from Worm to Cache

Let bn be a bucket address and ce the address of a cache entry within that bucket. bn and ce lie in the ranges

  0 <= bn < msize
  0 <= ce < CEPERBK

A cache block address must be assigned to each pair (bn, ce).
There are two natural choices:
(a) bn*CEPERBK + ce + caddr
(b) msize*ce + bn + caddr
Other, more complicated mappings are conceivable, but there is no reason to adopt them.
In practice the latter scheme, (b), is the one adopted.

The former is hard to adopt. Files tend to occupy consecutive blocks of fsworm, so under (a) the cache information of a newly created large file would monopolize a single bucket. The cache blocks managed by that bucket could then cache nothing new (until a dump to fsworm).

Moreover, with the latter scheme, when consecutive blocks of fsworm are cached they are likely to land in consecutive blocks of fscache as well. (This saves hard disk seek time.)

msize determination algorithm

How is msize determined?
Let m be the total number of map blocks plus cache blocks. If the number of map blocks is n, the possible value of m - n (= csize) must satisfy

  (n-1)*BKPERBLK*CEPERBK < m - n <= n*BKPERBLK*CEPERBK

In other words, both of

  1.0*m/(1 + BKPERBLK*CEPERBK) <= n < 1.0*(m + BKPERBLK*CEPERBK)/(1 + BKPERBLK*CEPERBK)

must hold; such an n is obtained by

  n = (m + BKPERBLK*CEPERBK)/(1 + BKPERBLK*CEPERBK)

That is,

  m - n = ((m - 1)*BKPERBLK*CEPERBK)/(1 + BKPERBLK*CEPERBK)

The m - n computed this way is not guaranteed to be a multiple of CEPERBK, so the following corrections are needed:

  msize = (m - n)/CEPERBK
  csize = msize*CEPERBK
  caddr = (msize + BKPERBLK - 1)/BKPERBLK + maddr

My fscache can hold

  1394540 block

so

  m = 1394540 - 3 = 1394537

By this computation,

  msize = 10322
  caddr = 1036
  csize = 1393470

and caddr + csize is 1394506. Since this is smaller than the 1394540 blocks of fscache, it ought to be fine; but the actual cwfs values differ. In reality this msize is adjusted further:

  msize = maxprime(msize - 5)	# Ken's value
  csize = msize*CEPERBK
  caddr = (msize + BKPERBLK - 1)/BKPERBLK + maddr

(see cw.c). Here maxprime(n) is the largest prime not exceeding n. Why is this adjustment necessary? I do not know. (As far as the relation between fsworm and fscache is concerned, it should be unnecessary.)
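Putting the above into code: a minimal sketch (standard C, not the cwfs source; maxprime() here is a naive stand-in for the one in cw.c) that reproduces the values quoted in this article:

  /* sketch: compute msize/csize/caddr for a given fscache size,
   * following the formulas above (BKPERBLK=10, CEPERBK=135, maddr=3) */
  #include <stdio.h>

  enum { BKPERBLK = 10, CEPERBK = 135 };

  /* naive stand-in for maxprime() in cw.c:
   * largest prime not exceeding n */
  long
  maxprime(long n)
  {
	long p, d;

	for(p = n; p >= 2; p--){
		for(d = 2; d*d <= p; d++)
			if(p % d == 0)
				break;
		if(d*d > p)
			return p;
	}
	return 2;
  }

  int
  main(void)
  {
	long maddr = 3, nblocks = 1394540;	/* my fscache */
	long m = nblocks - maddr;
	long n = (m + BKPERBLK*CEPERBK) / (1 + BKPERBLK*CEPERBK);
	long msize = (m - n) / CEPERBK;

	msize = maxprime(msize - 5);		/* Ken's value */
	printf("msize %ld csize %ld caddr %ld\n",
		msize, msize*CEPERBK,
		(msize + BKPERBLK - 1)/BKPERBLK + maddr);
	return 0;
  }

For nblocks = 1394540 this prints msize 10313, csize 1392255, caddr 1035, matching the statw output in Figure 2.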

Fscache Root
2013/03/09

Just as fsworm has a root, fscache has a root too. (Without one, the directory tree could not be traversed.)

The address of the fscache root block is determined from the address of the dump root block at the top of fsworm's dump stack, following the usual mapping rule.

Recovery

recovery
Updated 2013/02/28

There are various causes of trouble in cwfs, but the main cases are probably these two:
(a) a power failure during writing
(b) a hardware crash
These subdivide further into various cases, but here we assume that fsworm is sound (or, what amounts to the same thing, that a backup of fsworm exists). In that case, recovery is based on fsworm.

Assume that

  /dev/sdC0/fscache
  /dev/sdC0/fsworm

exist, and that:

    fsworm is sound, and
    the first block of fscache contains correct config information.

Under these assumptions, when cwfs starts, (on 9front) the message

  bootargs is (tcp, il, local!device)[local!/dev/sdC0/fscache]

appears. Input

  local!/dev/sdC0/fscache -c

and then answer the config: prompt with

  recover main
  end

That is all. (Recovery is very fast: 1-2 seconds?)

The first block of fscache is excluded from writing while cwfs is active, so unless the hard disk is physically damaged, data loss is limited to, at worst, what happened after the last dump.

All the information needed to restore fscache is contained in fsworm's block address range 0 to snext. Recovery does not require examining all of fsworm; it can be traced from the record of the last dump. cwfs should do this automatically, but for reference let me explain the structure of fsworm in a little more detail.

In Plan 9 (or 9front), past states of files become visible under

  /n/dump

by running

  9fs dump

and all the information seen there can easily be traced from the block address one before the next dump address snext ( = roaddr = snext - 1 ).

The program cwstudy, introduced later, displays the contents of a specified block address. The following is an example run of cwstudy.

 cpu% cwstudy 1755392
/dev/sdC0/fsworm
tag pad: 0000
tag tag: 11 (Tdir)
tag path: 1

name: /
uid: -1
gid: -1
mode: 0140555
muid: 0
qid path: 80000001
qid ver.: 0
size: 0
dblock: 1755391 0 0 0 0 0
iblock: 0 0 0 0
atime: 1343737574 // Tue Jul 31 21:26:14 JST 2012
mtime: 1343737574 // Tue Jul 31 21:26:14 JST 2012

The first name obtained is "/". The creation date is July 31, 2012, when fsworm was created. dblock[0], 1755391, is the address of the directory entry block under "/".

 cpu% cwstudy 1755391
/dev/sdC0/fsworm
tag pad: 0000
tag tag: 11 (Tdir)
tag path: 1

name: 2012
uid: -1
gid: -1
mode: 0140555
muid: 0
qid path: 80000001
qid ver.: 27
size: 0
dblock: 1755390 0 0 0 0 0
iblock: 0 0 0 0
atime: 1348729247 // Thu Sep 27 16:00:47 JST 2012
mtime: 1343797238 // Wed Aug 1 14:00:38 JST 2012

The directory name contained in block address 1755391 is 2012. Only one appears because fsworm went into service in 2012.

Block address 1755390 contains many directory entries.

 term% cwstudy 1755390
[omitted]
name: 0925
uid: -1
gid: -1
mode: 0140555
muid: 0
qid path: 80000001
qid ver.: 27
size: 0
dblock: 1755212 0 0 0 0 0
iblock: 0 0 0 0
atime: 1348584237 // Tue Sep 25 23:43:57 JST 2012
mtime: 1348584237 // Tue Sep 25 23:43:57 JST 2012

name: 0927
uid: -1
gid: -1
mode: 0140555
muid: 0
qid path: 80000001
qid ver.: 27
size: 0
dblock: 1755388 0 0 0 0 0
iblock: 0 0 0 0
atime: 1348729247 // Thu Sep 27 16:00:47 JST 2012
mtime: 1348729247 // Thu Sep 27 16:00:47 JST 2012

Their names represent the month and day of each dump. They match the names displayed by

  ls /n/dump/2012

Going further, the directory entries of 0927 under 2012 can be found in the same way. Their names match those displayed by

  ls /n/dump/2012/0927

There you will see names such as adm, sys, and usr.

dblock[0] of 0925 is 1755212. This block address is among the blocks dumped on September 25. (That day consumed blocks 1754642 through 1755216.)

The September 27 dump does not copy all of that day's files afresh; for unchanged contents, the old contents are used as they are. Here, for 0925, the September 25 contents are used unchanged.

In other words, fsworm uses a block-level differential method. (We will examine this again later.)

Review on recovery

If fsworm is sound, you can follow the super blocks up to snext and recover based on snext. But can you reliably follow them all the way to snext?

If fsworm is a true WORM, or a brand-new hard disk, there should be no problem: beyond snext there is no data that looks like a Tag, so there is no room for error. But what about a used hard disk?
When following the super blocks by their Tags, garbage might be mistaken for a Tag. The Tag structure of a super block is

 struct Tag
 {
	short pad; /* make tag end at a long boundary */
	short tag;
	Off path;
};

where pad is 0, tag is 1, and path is 2. You might think the probability of garbage matching these 12 B exactly, 2^-96 (Note 1), is small enough. After all, the probability of fscache crashing at all is tiny: perhaps once, if ever, in a server's lifetime (five years or so?).

However, such probability calculations assume that random data has been written. What if the fscache partition of this used hard disk had previously been used as an fscache partition and was reused as-is? The probability of misidentification rises to the proportion of super blocks within fsworm, which may not be negligible. So some care is advisable when creating an fscache partition (shift the partition's start address a little, for example...).

The Tcache block of fscache contains information that can be checked for consistency against the last super block of fsworm, so in normal recovery this worry should be unnecessary.

Note 1: In fact only tag and path are used in following the chain, so the probability is 2^-80. To reduce it further, the slast information could also be used, but whether that is worth the trouble is doubtful.

What is lost by Recovery
2013/03/28

Some free blocks are lost. A free block is a not-yet-written block that exists in the already dumped area; cwfs writes data there when it gets the chance, to use storage space effectively. 2037 of the free blocks are managed by the super block, and since that information is in fsworm it is not lost. But the part of the free block list beyond those 2037 entries lives in the Tfree blocks of fscache, and Tfree blocks are not copied to fsworm. Those are lost in recovery.

Other Configurations
2013/04/02

pseudo-RAID1

In the end I concluded that under the current cwfs configuration

  filsys main c(/dev/sdC0/fscache)(/dev/sdC0/fsworm)

taking a backup of fsworm is next to impossible, and decided to adopt another configuration:

  filsys main c(/dev/sdC0/fscache){(/dev/sdC0/fsworm)(/dev/sdD0/fsworm)}

This is a pseudo-RAID1 configuration: not the whole device but only the fsworm partition is handled RAID1-style.
/dev/sdC0/fsworm and /dev/sdD0/fsworm may differ in size; in that case the smaller size is used.
Writes go (in this case) in the order D0 → C0, and reads are done from C0.

Changing the disk organization requires preparation:

    A copy must be taken. Copying the whole partition takes too long, so the copy must be limited to the range up to the last super block.
    Even that may take too long, so the auto dump must be stopped in the meantime; that requires patching cwfs.

Admittedly, RAID is overkill for my home use...

It has been running smoothly at home, and I decided to run the university server this way too.

fake WORM

According to Cinap, a fake WORM contains a bitmap of the written blocks. In this case an HDD is used in place of a WORM, so it is possible to keep a bitmap showing the current usage state. The configuration in this case is:

  filsys main c(/dev/sdC0/fscache)f(/dev/sdC0/fsworm)

With this, it should be possible to work in a way that suits a whimsical person like me, who normally uses a single disk and, when the mood strikes, attaches a backup disk and takes a backup.

Creating fake WORM

My WORM is a true WORM, so creating a fake WORM cannot be done by just copying the device. I decided to build one from scratch and, for safety, to work on a terminal booted via PXE. On the local disk, prepare the plan9 partitions that will make up the cached fake WORM:

  /dev/sdC0/fscache
  /dev/sdC0/fsworm

With these in place, execute (Note 1):

  cwfs64x -c -f /dev/sdC0/fscache

Note 1: The cwfs command usage differs between the Bell Labs version (Geoff's original) and the 9front version; here we follow the 9front version. The 9front version unifies the usage of the -f option with that of other file systems such as kfs and fossil.

In the 9front version, the -c option enters config mode. At the config prompt, input the following:

 service cwfs
filsys main c(/dev/sdC0/fscache)f(/dev/sdC0/fsworm)
filsys dump o
filsys other (/dev/sdC0/other)
ream other
ream main
end

The above is done only once.

Next come commands to the cwfs console and shell-level commands. Commands to the cwfs console are shown here with the fscons> prompt.

It is safest to perform the following operations in a new window.

 fscons> users default
fscons> newuser arisawa
fscons> allow

term% mount -c /srv/cwfs /n/cwfs
term% mkdir /n/cwfs/adm
term% cp /adm/users /n/cwfs/adm

fscons> users

Note: newuser arisawa is needed because the system owner of my system is arisawa, not glenda; if you keep glenda it is unnecessary.

After this, using my cpdir is the quickest way:

 cpdir -mvug /root /n/cwfs adm 386 acme cfg cron lib mail rc sys

It is safest to check fd, mnt, n, tmp, and usr under /root individually.
In particular, cwfs should be visible under /root/n/.

Tvirgo

In the case of a fakeworm, the Tvirgo blocks begin at the wsize shown by statw on the cwfs console. The Tvirgo blocks represent, as a bitmap, the usage of fsworm blocks 0 through wsize: a written block has its bit set to 1, and the bit of a not-yet-written block is 0. The first 2 blocks of fsworm are not written, so the first 2 bits of the bitmap are 0.

With a fakeworm the bitmap occupies the tail of fsworm, so wsize is correspondingly smaller.
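To make the bitmap concrete, a sketch of the test (the bit order within a byte is my assumption, unverified against the source):

  /* sketch: test whether block ba is marked written in a fakeworm
   * bitmap.  Assumption: bit ba%8 of byte ba/8 -- unverified. */
  int
  iswritten(unsigned char *virgo, long long ba)
  {
	return (virgo[ba/8] >> (ba%8)) & 1;
  }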

Misc.

What did I do that day?
2013/03/18

When I wonder "what was I doing that day?", I mean things like which files I modified, not whether I went out drinking.
To enumerate all the files changed on a given day, on UNIX you would use the find command. Searching a huge number of files for the changed ones takes a long time (depending on how many files there are) and does not finish in seconds. On my MacBook, searching just my $HOME takes about 30 seconds (partly because I keep quite a lot of files):

 bash$ touch a.txt
bash$ time find $HOME -newer a.txt -print
find: /Users/arisawa/.emacs.d/auto-save-list: Permission denied
/Users/arisawa/Library/Application Support/Google/Chrome/Default/Cookies
 ...
 ...
find: /Users/arisawa/src/rminnich-vx32-17a064eed9c2/src/9vx/osx/9vx.app: Permission denied

real 0m28.372s
user 0m0.783s
sys 0m18.783s
bash$

What I introduce here is my lr command, which has an option equivalent to find's -newer. To find the files changed yesterday among all the files on the server, use it as follows:

 term% cpu -h ar
ar% 9fs dump
mounting as arisawa
mounting as arisawa
ar% ls /n/dump/2013|tail -2
/n/dump/2013/0317
/n/dump/2013/0318
ar% mtime /n/dump/2013/0317
 1363498134 /n/dump/2013/0317
ar% time lr -lt 1363498134 /n/dump/2013/0318
 ...
 ...
--rw-rw-rw- web arisawa 5819730 2013/03/18 12:54:03 /n/dump/2013/0318/usr/cpa/www/log/dict
d-rwxrwxrwx arisawa arisawa 0 2013/03/17 21:51:56 /n/dump/2013/0318/usr/cpa/www/users
 ...
 ...
0.01u 0.18s 1.91r lr -lt 1363498134 /n/dump/2013/0318
ar%

On this day 33 files were changed, mostly log files; some of the changes came from the web CGI. Although every file in the system is searched, the search completes in under 2 seconds, even though I keep a huge number of files on the server!

Why can changes be found so fast?

Because atime is used in the search. atime means access time; the manual gives no more detailed explanation than that. (The Plan 9 manual says read time, but atime is also updated on writes.)

Note: lr is available at

  http://plan9.aichi-u.ac.jp/netlib/cmd/lr/

atime
2013/06/06

Observing the actual behavior, Plan 9 and UNIX (MacOSX and Linux) behave differently.

In Plan 9, the atime of every directory on the route the file server traversed to find a file is updated. Within a huge directory tree, the routes the file server actually traversed on a given day are a tiny fraction, so by looking at atime the routes that need searching can be drastically reduced.

UNIX is different. The atime of the directories on the route the file server traversed to find a file is not updated; only the atime of the directories and files where changes actually occurred is updated. Therefore atime cannot be relied on to find updates efficiently.

The following concrete examples show the difference in atime between Plan 9 and UNIX.

Plan9

 # Plan9
term% date; touch $home/doc/x;ls -dlu /usr $home/doc
Wed Jun 5 07:58:17 JST 2013
d-rwxrwxr-x M 20 sys sys 0 Jun 5 07:58 /usr
d-rwxrwxr-x M 20 arisawa arisawa 0 Jun 5 07:58 /usr/arisawa/doc
 term%

Linux

 # UNIX (Linux)
hebe$ date; touch $HOME/doc/x; ls -dlu /home $HOME/doc
Wed Jun 5 07:56:41 JST 2013
drwxr-xr-x 3 root root 4096 Jun 4 09:49 /home
drwxr-xr-x 9 arisawa arisawa 4096 Jun 5 07:46 /home/arisawa/doc
hebe$

OSX

 # UNIX (OSX)
-bash$ date; touch $HOME/doc/x; ls -dlu /Users $HOME/doc
Wed Jun 5 08:08:27 JST 2013
drwxr-xr-x 6 root admin 204 May 31 07:51 /Users
drwxr-xr-x 3 arisawa staff 102 Jun 5 08:03 /Users/arisawa/doc
-bash$

cwstudy

This section is incomplete.

usage

cwstudy block_address

cwstudy -C block_address

cwstudy path

cwstudy super

References

[1] Sean Quinlan “A Cached WORM File System”
Softw., Pract. Exper., vol. 21 (1991), pp. 1289-1299
http://plan9.bell-labs.com/who/seanq/cw.pdf

[2] Ken Thompson, Geoff Collyer “The 64-bit Standalone Plan 9 File Server”
http://plan9.bell-labs.com/sys/doc/fs/fs.pdf