URL: https://www.usenix.org/legacy/events/usenix99/full_papers/olson/olson.pdf
%PDF-1.2
%����
2 0 obj
<<
/Length 6913
>>
stream
BT
/F8 1 Tf
14 0 0 14 275.358 691.2 Tm
0 g
/GS1 gs
0 Tc
0 Tw
[(Berk)10(eley DB)]TJ
/F6 1 Tf
12 0 0 12 270.372 662.4 Tm
(Michael A. Olson)Tj
1.0675 -1.2 TD
[(K)25(eith Bostic)]TJ
-0.336 -1.2 TD
[(Mar)18(go Seltzer)]TJ
/F4 1 Tf
-1.9715 -1.32 TD
[(Sleepycat Softwar)37(e)10(,)10( )-10(Inc.)]TJ
/F8 1 Tf
2.9485 -3 TD
(Abstract)Tj
/F6 1 Tf
10 0 0 10 79.2 565.56 Tm
0.042 Tw
[(Berk)10(ele)15(y)-292(D)0(B)-292(i)0(s)-292(a)0(n)-292(Open Source embedded database system with a number of k)10(e)15(y)15( )-15(adv)25(antages o)15(v)15(e)0(r)-292(comparable sys-)]TJ
0 -1.2 TD
0.06 Tw
[(tems. )-250(It)-310(is simple to use, supports concurrent access by multiple users, and pro)15(vides industrial-strength transaction)]TJ
T*
0.155 Tw
[(support, including survi)25(ving system and disk crashes.)-655(This paper describes the design and technical features of)]TJ
T*
0 Tw
[(Berk)10(ele)15(y)-250(DB, the distrib)20(ution, and its license.)]TJ
/F8 1 Tf
12 0 0 12 79.2 505.56 Tm
0.25 Tw
[(1. Intr)18(oduction)]TJ
/F6 1 Tf
10 0 0 10 79.2 489.36 Tm
0.069 Tw
[(The Berk)10(ele)15(y)-319(Database \(Berk)10(ele)15(y)-319(DB\) is an embedded)]TJ
T*
0.025 Tw
[(database system that can be used in applications requir)20(-)]TJ
T*
0.164 Tw
[(ing high-performance concurrent storage and retrie)25(v)25(a)0(l)]TJ
T*
0.262 Tw
[(of k)10(e)15(y/v)25(alue pairs.)-762(The softw)10(are is distrib)20(uted as a)]TJ
T*
0.006 Tw
[(library that can be link)10(ed directly into an application.)-506(It)]TJ
T*
0.145 Tw
[(pro)15(vides a v)25(ariety of programmatic interf)10(aces, includ-)]TJ
T*
0.024 Tw
[(ing callable APIs for C, C++, Perl, Tcl and Ja)20(v)25(a)0(.)-524(Users)]TJ
T*
0.033 Tw
[(may do)25(wnload Berk)10(ele)15(y)-283(D)0(B)-283(from Sleep)10(ycat Softw)10(are')55(s)]TJ
T*
0 Tw
[(W)80(e)0(b)-250(site, at)]TJ
/F4 1 Tf
4.9187 0 TD
[(www)74(.sleepycat.com)]TJ
/F6 1 Tf
7.8135 0 TD
(.)Tj
-12.7322 -1.62 TD
0.133 Tw
[(Sleep)10(ycat distrib)20(utes Berk)10(ele)15(y)-383(D)0(B)-383(a)0(s)-383(a)0(n)-383(Open Source)]TJ
0 -1.2 TD
0.08 Tw
[(product. )-250(The)-330(compan)15(y)-330(collects license fees for certain)]TJ
T*
0 Tw
[(uses of the softw)10(are and sells support and services.)]TJ
/F8 1 Tf
12 0 0 12 79.2 323.16 Tm
0.25 Tw
(1.1. History)Tj
/F6 1 Tf
10 0 0 10 79.2 306.96 Tm
0.056 Tw
[(Berk)10(ele)15(y)-306(D)0(B)-306(b)0(e)15(g)15( )296(an)-306(as)-306(a)-306(n)0(e)25(w)25( )-25(implementation of a hash)]TJ
T*
0.084 Tw
(access method to replace both)Tj
/F2 1 Tf
12.6677 0 TD
(hsearch)Tj
/F6 1 Tf
4.5339 0 TD
[(and the v)25(ari-)]TJ
-17.2016 -1.2 TD
(ous)Tj
/F2 1 Tf
1.9355 0 TD
(dbm)Tj
/F6 1 Tf
2.3465 0 TD
0.297 Tw
(implementations \()Tj
/F2 1 Tf
7.5463 0 TD
(dbm)Tj
/F6 1 Tf
2.3466 0 TD
[(from A)111(T&T)74(,)]TJ
/F2 1 Tf
5.8241 0 TD
(ndbm)Tj
/F6 1 Tf
-19.9989 -1.2 TD
0.133 Tw
[(from Berk)10(ele)15(y)65(,)65( )-65(and)]TJ
/F2 1 Tf
8.3077 0 TD
(gdbm)Tj
/F6 1 Tf
2.7833 0 TD
[(from the GNU project\).)-633(In)]TJ
-11.091 -1.2 TD
0.037 Tw
[(1990 Seltzer and Y)55(igit produced a package called Hash)]TJ
T*
0 Tw
(to do this [Selt91].)Tj
0 -1.62 TD
0.311 Tw
[(The \214rst general release of Berk)10(ele)15(y)-561(DB, in 1991,)]TJ
0 -1.2 TD
0.304 Tw
[(included some interf)10(ace changes and a ne)25(w)-554(B+tree)]TJ
T*
0.089 Tw
[(access method.)-589(At roughly the same time, Seltzer and)]TJ
T*
0.12 Tw
[(Olson de)25(v)15(eloped a prototype transaction system based)]TJ
T*
0.335 Tw
[(on Berk)10(ele)15(y)-586(DB, called LIBTP [Selt92], b)20(ut ne)25(v)15(e)0(r)]TJ
T*
0 Tw
(released the code.)Tj
0 -1.62 TD
0.065 Tw
[(The 4.4BSD UNIX release included Berk)10(ele)15(y)-315(D)0(B)-315(1.85)]TJ
0 -1.2 TD
0.06 Tw
[(in 1992.)-560(Seltzer and Bostic maintained the code in the)]TJ
T*
0.154 Tw
[(early 1990s in Berk)10(ele)15(y)-405(and in Massachusetts.)-655(Man)15(y)]TJ
T*
0 Tw
(users adopted the code during this period.)Tj
0 -1.62 TD
0.043 Tw
[(By mid-1996, users w)10(anted commercial support for the)]TJ
0 -1.2 TD
0.453 Tw
[(softw)10(are. )-250(In)-703(response, Bostic and Seltzer formed)]TJ
T*
1.013 Tw
[(Sleep)10(ycat Softw)10(are. )-250(The)-1263(compan)15(y)-1513(enhances,)]TJ
24.4 42.72 TD
0.162 Tw
[(distrib)20(utes, and supports Berk)10(ele)15(y)-412(D)0(B)-412(and supporting)]TJ
0 -1.2 TD
0.22 Tw
[(softw)10(are and documentation.)-720(Sleep)10(ycat released v)15(e)0(r)20(-)]TJ
T*
0.168 Tw
[(sion 2.1 of Berk)10(ele)15(y)-418(D)0(B)-418(i)0(n)-418(mid-1997 with important)]TJ
T*
0.006 Tw
[(ne)25(w)-256(features, including support for concurrent access to)]TJ
T*
0.168 Tw
[(databases. )-250(The)-418(compan)15(y)-418(mak)10(es about three commer)20(-)]TJ
T*
0.096 Tw
[(cial releases a year)40(,)-346(and most recently shipped v)15(ersion)]TJ
T*
(2.8.)Tj
/F8 1 Tf
12 0 0 12 323.2 403.56 Tm
0 Tw
[(1.2. )-250(Ov)10(er)10(view of Berk)10(eley DB)]TJ
/F6 1 Tf
10 0 0 10 323.2 387.36 Tm
0.309 Tw
[(The C interf)10(aces in Berk)10(ele)15(y)-559(D)0(B)-559(permit)]TJ
/F2 1 Tf
18.3771 0 TD
(dbm)Tj
/F6 1 Tf
1.7999 0 TD
(-style)Tj
-20.1769 -1.2 TD
0.459 Tw
(record management for databases, with signi\214cant)Tj
T*
0.127 Tw
[(e)15(xtensions to handle duplicate data items ele)15(gantly)65(,)-377(t)0(o)]TJ
T*
0.243 Tw
[(deal with concurrent access, and to pro)15(vide transac-)]TJ
T*
0.071 Tw
(tional support so that multiple changes can be simulta-)Tj
T*
0.127 Tw
[(neously committed \(so that the)15(y)-377(are made permanent\))]TJ
T*
0.185 Tw
(or rolled back \(so that the database is restored to its)Tj
T*
0 Tw
[(state at the be)15(ginning of the transaction\).)]TJ
0 -1.62 TD
0.103 Tw
[(C++ and Ja)20(v)25(a)25( )-25(interf)10(aces pro)15(vide a small set of classes)]TJ
0 -1.2 TD
0.196 Tw
[(for operating on a database.)-696(The main class in both)]TJ
T*
0.059 Tw
(cases is called)Tj
/F2 1 Tf
6.0907 0 TD
(Db)Tj
/F6 1 Tf
1.1999 0 TD
[(,)-309(and pro)15(vides methods that encapsu-)]TJ
-7.2906 -1.2 TD
0.113 Tw
(late the)Tj
/F2 1 Tf
3.3914 0 TD
(dbm)Tj
/F6 1 Tf
1.7999 0 TD
[(-style interf)10(aces that the C interf)10(aces pro-)]TJ
-5.1913 -1.2 TD
(vide.)Tj
0 -1.62 TD
0.256 Tw
[(Tcl and Perl interf)10(aces allo)25(w)-506(d)0(e)25(v)25( )496(elopers w)10(orking in)]TJ
0 -1.2 TD
0.172 Tw
[(those languages to use Berk)10(ele)15(y)-422(D)0(B)-422(i)0(n)-422(their applica-)]TJ
T*
0.092 Tw
[(tions. )-250(Bindings)-342(for both languages are included in the)]TJ
T*
[(distrib)20(ution.)]TJ
0 -1.62 TD
0.107 Tw
[(De)25(v)15(elopers may compile their applications and link in)]TJ
0 -1.2 TD
0 Tw
[(Berk)10(ele)15(y)-250(D)0(B)-250(statically or dynamically)65(.)]TJ
/F8 1 Tf
12 0 0 12 323.2 128.76 Tm
[(1.3. )-250(Ho)10(w)-250(Berk)10(eley DB is used)]TJ
/F6 1 Tf
10 0 0 10 323.2 112.56 Tm
0.065 Tw
[(The Berk)10(ele)15(y)-315(D)0(B)-315(library supports concurrent access to)]TJ
T*
0.262 Tw
[(databases. )-249(It)-511(can be link)10(ed into standalone applica-)]TJ
T*
0.149 Tw
(tions, into a collection of cooperating applications, or)Tj
T*
0.421 Tw
[(into serv)15(ers that handle requests and do database)]TJ
ET
endstream
endobj
3 0 obj
<<
/ProcSet [/PDF /Text ]
/Font <<
/F2 4 0 R
/F4 5 0 R
/F6 6 0 R
/F8 7 0 R
>>
/ExtGState <<
/GS1 8 0 R
>>
>>
endobj
11 0 obj
<<
/Length 8000
>>
stream
BT
/F6 1 Tf
10 0 0 10 79.2 708 Tm
0 g
/GS1 gs
0 Tc
0 Tw
(operations on behalf of clients.)Tj
0 -1.62 TD
0.086 Tw
(Compared to using a standalone database management)Tj
0 -1.2 TD
0.085 Tw
[(system, Berk)10(ele)15(y)-335(D)0(B)-335(i)0(s)-335(easy to understand and simple)]TJ
T*
0.383 Tw
[(to use.)-883(The softw)10(are stores and retrie)25(v)15(e)0(s)-632(records,)]TJ
T*
0.277 Tw
[(which consist of k)10(e)15(y/v)25(alue pairs.)-777(K)25(e)25( )517(ys)-527(are used to)]TJ
T*
0.07 Tw
[(locate items and can be an)15(y)-320(data type or structure sup-)]TJ
T*
0 Tw
(ported by the programming language.)Tj
0 -1.62 TD
0.081 Tw
[(The programmer can pro)15(vide the functions that Berk)10(e-)]TJ
0 -1.2 TD
0.076 Tw
[(le)15(y)-326(D)0(B)-326(uses to operate on k)10(e)15(ys. )-250(F)15(or e)15(xample, B+trees)]TJ
T*
0.172 Tw
(can use a custom comparison function, and the Hash)Tj
T*
0.052 Tw
[(access method can use a custom hash function.)-552(Berk)10(e-)]TJ
T*
0.272 Tw
[(le)15(y)-522(D)0(B)-522(uses def)10(ault functions if none are supplied.)]TJ
T*
0.087 Tw
[(Otherwise, Berk)10(ele)15(y)-337(D)0(B)-337(does not e)15(xamine or interpret)]TJ
T*
0.093 Tw
[(either k)10(e)15(ys)-343(or)-343(v)25(alues in an)15(y)-343(w)10(ay)65(.)-593(V)111(alues may be arbi-)]TJ
T*
0 Tw
(trarily long.)Tj
0 -1.62 TD
0.069 Tw
[(It is also important to understand what Berk)10(ele)15(y)-319(D)0(B)-319(i)0(s)]TJ
0 -1.2 TD
0.186 Tw
[(not. )-250(It)-436(is not a database serv)15(er that handles netw)10(ork)]TJ
T*
0.03 Tw
[(requests. )-250(It)-280(is not an SQL engine that e)15(x)15(ecutes queries.)]TJ
T*
0.155 Tw
(It is not a relational or object-oriented database man-)Tj
T*
0 Tw
(agement system.)Tj
0 -1.62 TD
0.11 Tw
[(It is possible to b)20(uild an)15(y)-360(o)0(f)-360(those on top of Berk)10(ele)15(y)]TJ
0 -1.2 TD
0.212 Tw
[(DB, b)20(ut the package, as distrib)20(uted, is an embedded)]TJ
T*
0.144 Tw
[(database engine.)-644(It has been designed to be portable,)]TJ
T*
0 Tw
[(small, f)10(ast, and reliable.)]TJ
/F8 1 Tf
12 0 0 12 79.2 385.2 Tm
[(1.4. )-250(A)25(pplications that use Berk)10(eley DB)]TJ
/F6 1 Tf
10 0 0 10 79.2 369 Tm
0.175 Tw
[(Berk)10(ele)15(y)-425(D)0(B)-425(i)0(s)-425(embedded in a v)25(ariety of proprietary)]TJ
T*
0.384 Tw
[(and Open Source softw)10(are packages.)-884(This section)]TJ
T*
0 Tw
[(highlights a fe)25(w)-250(o)0(f)-250(the products that use it.)]TJ
0 -1.62 TD
0.147 Tw
[(Directory serv)15(ers, which do data storage and retrie)25(v)25(a)0(l)]TJ
0 -1.2 TD
0.282 Tw
[(using the Local Directory Access Protocol \(LD)40(AP\),)]TJ
T*
0.096 Tw
[(pro)15(vide naming and directory lookup service on local-)]TJ
T*
0.284 Tw
[(area netw)10(orks. )-250(This)-534(service is, essentially)65(,)-534(database)]TJ
T*
0.004 Tw
[(query and update, b)20(ut uses a simple protocol rather than)]TJ
T*
0.22 Tw
[(SQL or ODBC.)-720(Berk)10(ele)15(y)-470(D)0(B)-470(i)0(s)-470(the embedded data)]TJ
T*
0.129 Tw
[(manager in the majority of deplo)10(yed directory serv)15(ers)]TJ
T*
0.235 Tw
[(today)65(,)-485(including LD)40(AP serv)15(ers from Netscape, Mes-)]TJ
T*
0 Tw
(sageDirect \(formerly Isode\), and others.)Tj
0 -1.62 TD
0.189 Tw
[(Berk)10(ele)15(y)-438(D)0(B)-438(i)0(s)-438(also embedded in a lar)18(ge number of)]TJ
0 -1.2 TD
0.53 Tw
[(mail serv)15(ers. )-250(Intermail,)-780(from Softw)10(are.com, uses)]TJ
T*
0.211 Tw
[(Berk)10(ele)15(y)-461(D)0(B)-461(a)0(s)-461(a)-461(message store and as the backing)]TJ
T*
0.36 Tw
[(store for its directory serv)15(er)55(.)-860(The sendmail serv)15(er)]TJ
T*
0.117 Tw
[(\(including both the commercial Sendmail Pro of)25(fering)]TJ
T*
0.328 Tw
[(from Sendmail, Inc. and the v)15(ersion distrib)20(uted by)]TJ
T*
0.23 Tw
[(sendmail.or)18(g\) uses Berk)10(ele)15(y)-480(D)0(B)-480(t)0(o)-480(store aliases and)]TJ
T*
0.901 Tw
[(other information.)-1401(Similarly)65(,)-1151(Post\214x \(formerly)]TJ
T*
0.346 Tw
[(VMailer\) uses Berk)10(ele)15(y)-596(D)0(B)-596(t)0(o)-596(store administrati)25(v)15(e)]TJ
T*
(information.)Tj
0 -1.62 TD
0.013 Tw
[(In addition, Berk)10(ele)15(y)-263(D)0(B)-263(i)0(s)-263(embedded in a wide v)25(ariety)]TJ
0 -1.2 TD
0.499 Tw
[(of other softw)10(are products.)-999(Example applications)]TJ
24.4 62.76 TD
0.037 Tw
[(include managing access control lists, storing user k)10(e)15(ys)]TJ
0 -1.2 TD
0.275 Tw
[(in a public-k)10(e)15(y)15( )-15(infrastructure, recording machine-to-)]TJ
T*
0.052 Tw
[(netw)10(ork-address mappings in address serv)15(ers, and stor)20(-)]TJ
T*
0.041 Tw
[(ing con\214guration and de)25(vice information in video post-)]TJ
T*
0 Tw
[(production softw)10(are.)]TJ
0 -1.62 TD
0.248 Tw
[(Finally)65(,)-498(Berk)10(ele)15(y)-498(D)0(B)-498(i)0(s)-498(a)-498(part of man)15(y)-498(other Open)]TJ
0 -1.2 TD
0 Tw
[(Source softw)10(are packages a)20(v)25(ailable on the Internet.)-501(F)15(o)0(r)]TJ
T*
0.06 Tw
[(e)15(xample, the softw)10(are is embedded in the Apache W)80(e)0(b)]TJ
T*
0 Tw
[(serv)15(er and the Gnome desktop.)]TJ
/F8 1 Tf
12 0 0 12 323.2 577.8 Tm
0.25 Tw
[(2. Access)-250(Methods)]TJ
/F6 1 Tf
10 0 0 10 323.2 561.6 Tm
0.083 Tw
[(In database terminology)65(,)-333(a)0(n)-333(access method is the disk-)]TJ
T*
0.196 Tw
(based structure used to store data and the operations)Tj
T*
0.605 Tw
[(a)20(v)25(ailable on that structure.)-1105(F)15(o)0(r)-855(e)15(xample, man)15(y)]TJ
T*
0.385 Tw
(database systems support a B+tree access method.)Tj
T*
0.12 Tw
[(B+trees allo)25(w)-370(equality-based lookups \(\214nd k)10(e)15(ys)-370(equal)]TJ
T*
0.4 Tw
[(to some constant\), range-based lookups \(\214nd k)10(e)15(ys)]TJ
T*
0.119 Tw
[(between tw)10(o)-369(constants\) and record insertion and dele-)]TJ
T*
(tion.)Tj
0 -1.62 TD
0.223 Tw
[(Berk)10(ele)15(y)-473(D)0(B)-473(supports three access methods: B+tree,)]TJ
0 -1.2 TD
0.155 Tw
[(Extended Linear Hashing \(Hash\), and Fix)15(ed- or V)111(ari-)]TJ
T*
0.364 Tw
[(able-length Records \(Recno\).)-864(All three operate on)]TJ
T*
0.196 Tw
[(records composed of a k)10(e)15(y)15( )-15(and a data v)25(alue. )-250(In)-446(the)]TJ
T*
0.13 Tw
[(B+tree and Hash access methods, k)10(e)15(ys)-380(can ha)20(v)15(e)15( )-15(arbi-)]TJ
T*
0.359 Tw
[(trary structure.)-859(In the Recno access method, each)]TJ
T*
0.026 Tw
[(record is assigned a record number)40(,)-276(which serv)15(es as the)]TJ
T*
0.031 Tw
[(k)10(e)15(y)65(.)65( )-315(In)-281(all the access methods, the v)25(alue can ha)20(v)15(e)15( )-15(arbi-)]TJ
T*
0.142 Tw
[(trary structure.)-642(The programmer can supply compari-)]TJ
T*
0.213 Tw
[(son or hashing functions for k)10(e)15(ys, and Berk)10(ele)15(y)-463(D)0(B)]TJ
T*
0 Tw
[(stores and retrie)25(v)15(e)0(s)-250(v)25(alues without interpreting them.)]TJ
0 -1.62 TD
0.107 Tw
(All of the access methods use the host \214lesystem as a)Tj
0 -1.2 TD
0 Tw
(backing store.)Tj
/F8 1 Tf
12 0 0 12 323.2 283.2 Tm
0.25 Tw
(2.1. Hash)Tj
/F6 1 Tf
10 0 0 10 323.2 267 Tm
0.399 Tw
[(Berk)10(ele)15(y)-648(D)0(B)-648(includes a Hash access method that)]TJ
T*
0.986 Tw
[(implements e)15(xtended linear hashing [Litw80].)]TJ
T*
0.002 Tw
(Extended linear hashing adjusts the hash function as the)Tj
T*
0.051 Tw
[(hash table gro)25(ws, attempting to k)10(eep all b)20(uck)10(ets under)20(-)]TJ
T*
0 Tw
(full in the steady state.)Tj
0 -1.62 TD
0.165 Tw
(The Hash access method supports insertion and dele-)Tj
0 -1.2 TD
0.026 Tw
[(tion of records and lookup by e)15(xact match only)65(.)-526(Appli-)]TJ
T*
0.004 Tw
[(cations may iterate o)15(v)15(e)0(r)-254(all records stored in a table, b)20(u)0(t)]TJ
T*
0 Tw
[(the order in which the)15(y)-250(are returned is unde\214ned.)]TJ
/F8 1 Tf
12 0 0 12 323.2 136.8 Tm
0.25 Tw
[(2.2. B+tr)18(ee)]TJ
/F6 1 Tf
10 0 0 10 323.2 120.6 Tm
0.468 Tw
[(Berk)10(ele)15(y)-718(D)0(B)-718(includes a B+tree [Come79] access)]TJ
T*
0 Tw
[(method. )-250(B+trees)-250(store records of k)10(e)15(y/v)25(alue pairs in leaf)]TJ
T*
0.052 Tw
[(pages, and pairs of \(k)10(e)15(y)65(,)-302(child page address\) at internal)]TJ
T*
0.288 Tw
[(nodes. )-250(K)25(e)15(ys)-538(in)-538(the tree are stored in sorted order)40(,)]TJ
ET
endstream
endobj
12 0 obj
<<
/ProcSet [/PDF /Text ]
/Font <<
/F6 6 0 R
/F8 7 0 R
>>
/ExtGState <<
/GS1 8 0 R
>>
>>
endobj
14 0 obj
<<
/Length 8348
>>
stream
BT
/F6 1 Tf
10 0 0 10 79.2 708 Tm
0 g
/GS1 gs
0 Tc
0.058 Tw
(where the order is determined by the comparison func-)Tj
0 -1.2 TD
0.081 Tw
[(tion supplied when the database w)10(as created.)-581(P)15(ages at)]TJ
T*
0.039 Tw
[(the leaf le)25(v)15(e)0(l)-289(o)0(f)-289(the tree include pointers to their neigh-)]TJ
T*
0.144 Tw
[(bors to simplify tra)20(v)15(ersal. )-250(B+trees)-394(support lookup by)]TJ
T*
0.007 Tw
[(e)15(xact match \(equality\) or range \(greater than or equal to)]TJ
T*
0.039 Tw
[(a)-289(k)10(e)15(y\). )-250(Lik)10(e)-289(Hash tables, B+trees support record inser)20(-)]TJ
T*
0 Tw
[(tion, deletion, and iteration o)15(v)15(e)0(r)-250(all records in the tree.)]TJ
0 -1.62 TD
0.065 Tw
(As records are inserted and pages in the B+tree \214ll up,)Tj
0 -1.2 TD
0.022 Tw
[(the)15(y)-272(are split, with about half the k)10(e)15(ys)-272(going into a ne)25(w)]TJ
T*
0.16 Tw
[(peer page at the same le)25(v)15(e)0(l)-410(i)0(n)-410(the tree.)-660(Most B+tree)]TJ
T*
0.039 Tw
[(implementations lea)20(v)15(e)15( )-15(both nodes half-full after a split.)]TJ
T*
0.276 Tw
(This leads to poor performance in a common case,)Tj
T*
0.152 Tw
[(where the caller inserts k)10(e)15(ys)-402(in)-402(order)55(.)-652(T)80(o)-402(handle this)]TJ
T*
0.164 Tw
[(case, Berk)10(ele)15(y)-414(D)0(B)-414(k)10(eeps track of the insertion order)40(,)]TJ
T*
0.202 Tw
[(and splits pages une)25(v)15(enly to k)10(eep pages fuller)55(.)-702(This)]TJ
T*
0.23 Tw
(reduces tree size, yielding better search performance)Tj
T*
0 Tw
(and smaller databases.)Tj
0 -1.62 TD
0.318 Tw
[(On deletion, empty pages are coalesced by re)25(v)15(erse)]TJ
0 -1.2 TD
0.203 Tw
[(splits into single pages.)-703(The access method does no)]TJ
T*
0.035 Tw
[(other page balancing on insertion or deletion.)-535(K)25(e)25( )275(ys)-285(are)]TJ
T*
0.193 Tw
[(not mo)15(v)15(e)0(d)-443(among pages at e)25(v)15(ery update to k)10(eep the)]TJ
T*
0.221 Tw
[(tree well-balanced.)-721(While this could impr)]TJ
17.9603 0 TD
-0.015 Tc
(ove )Tj
1.8846 0 TD
0 Tc
(search)Tj
-19.8449 -1.2 TD
0.234 Tw
[(times in some cases, the additional code comple)15(xity)]TJ
T*
0 Tw
[(leads to slo)25(wer updates and is prone to deadlocks.)]TJ
0 -1.62 TD
0.045 Tw
[(F)15(o)0(r)-295(simplicity)65(,)-295(Berk)10(ele)15(y)-295(D)0(B)-295(B+trees do no pre\214x com-)]TJ
0 -1.2 TD
0 Tw
[(pression of k)10(e)15(ys)-250(at)-250(internal or leaf nodes.)]TJ
/F8 1 Tf
12 0 0 12 79.2 365.4 Tm
0.25 Tw
(2.3. Recno)Tj
/F6 1 Tf
10 0 0 10 79.2 349.2 Tm
0.023 Tw
[(Berk)10(ele)15(y)-274(D)0(B)-274(includes a \214x)15(ed- or v)25(ariable-length record)]TJ
T*
0.507 Tw
(access method, called)Tj
/F4 1 Tf
10.464 0 TD
(Recno)Tj
/F6 1 Tf
2.4988 0 TD
[(.)-1007(The Recno access)]TJ
-12.9628 -1.2 TD
0.09 Tw
(method assigns logical record numbers to each record,)Tj
T*
0.098 Tw
(and can search for and update records by record num-)Tj
T*
0.004 Tw
[(ber)55(.)-504(Recno is able, for e)15(xample, to load a te)15(xt \214le into a)]TJ
T*
0.151 Tw
[(database, treating each line as a record.)-651(This permits)]TJ
T*
0.131 Tw
[(f)10(ast searches by line number for applications lik)10(e)-381(t)0(e)15(x)0(t)]TJ
T*
0 Tw
(editors [Ston82].)Tj
0 -1.62 TD
0.259 Tw
[(Recno is actually b)20(uilt on top of the B+tree access)]TJ
0 -1.2 TD
0.319 Tw
[(method and pro)15(vides a simple interf)10(ace for storing)]TJ
T*
0.314 Tw
[(sequentially-ordered data v)25(alues. )-250(The)-564(Recno access)]TJ
T*
0.227 Tw
[(method generates k)10(e)15(ys)-477(internally)65(.)-727(The programmer')55(s)]TJ
T*
0.16 Tw
[(vie)25(w)-410(o)0(f)-410(the v)25(alues is that the)15(y)-410(are numbered sequen-)]TJ
T*
0.025 Tw
[(tially from one.)-525(De)25(v)15(elopers can choose to ha)20(v)15(e)15( )-15(records)]TJ
T*
0.9 Tw
[(automatically renumbered when lo)25(wer)20(-numbered)]TJ
T*
0.004 Tw
[(records are added or deleted.)-504(In this case, ne)25(w)-254(k)10(e)15(y)0(s)-254(can)]TJ
T*
0 Tw
[(be inserted between e)15(xisting k)10(e)15(ys.)]TJ
/F8 1 Tf
12 0 0 12 79.2 123 Tm
0.25 Tw
[(3. F)25(eatur)18(es)]TJ
/F6 1 Tf
10 0 0 10 79.2 106.8 Tm
0.183 Tw
[(This section describes important features of Berk)10(ele)15(y)]TJ
T*
0.096 Tw
[(DB. )-250(In)-346(general, de)25(v)15(elopers can choose which features)]TJ
T*
0.049 Tw
(are useful to them, and use only those that are required)Tj
24.4 62.52 TD
0 Tw
(by their application.)Tj
0 -1.62 TD
0.103 Tw
[(F)15(o)0(r)-353(e)15(xample, when an application opens a database, it)]TJ
0 -1.2 TD
0.01 Tw
[(can declare the de)15(gree of concurrenc)15(y)-260(and reco)15(v)15(ery that)]TJ
T*
0.005 Tw
[(it requires.)-505(Simple stand-alone applications, and in par)20(-)]TJ
T*
0.049 Tw
(ticular ports of applications that used)Tj
/F2 1 Tf
15.3477 0 TD
(dbm)Tj
/F6 1 Tf
2.099 0 TD
(or one of its)Tj
-17.4467 -1.2 TD
0.109 Tw
[(v)25(ariants, generally do not require concurrent access or)]TJ
T*
0.097 Tw
[(crash reco)15(v)15(ery)65(.)-597(Other applications, such as enterprise-)]TJ
T*
0.308 Tw
(class database management systems that store sales)Tj
T*
0.264 Tw
(transactions or other critical data, need full transac-)Tj
T*
0.393 Tw
[(tional service.)-893(Single-user operation is f)10(aster than)]TJ
T*
0.117 Tw
[(multi-user operation, since no o)15(v)15(erhead is incurred by)]TJ
T*
0.065 Tw
[(locking. )-251(Running)-316(with the reco)15(v)15(ery system disabled is)]TJ
T*
0.173 Tw
[(f)10(aster than running with it enabled, since log records)]TJ
T*
0.27 Tw
(need not be written when changes are made to the)Tj
T*
(database.)Tj
0 -1.62 TD
0.085 Tw
(In addition, some core subsystems, including the lock-)Tj
0 -1.2 TD
0.034 Tw
[(ing system and the logging f)10(acility)65(,)-284(can be used outside)]TJ
T*
0.177 Tw
[(the conte)15(xt of the access methods as well.)-677(Although)]TJ
T*
0.178 Tw
[(fe)25(w)-428(users ha)20(v)15(e)15( )-15(chosen to do so, it is possible to use)]TJ
T*
0.094 Tw
[(only the lock manager in Berk)10(ele)15(y)-344(D)0(B)-344(t)0(o)-344(control con-)]TJ
T*
0.224 Tw
[(currenc)15(y)-474(i)0(n)-474(a)0(n)-474(application, without using an)15(y)-474(o)0(f)-474(the)]TJ
T*
0.016 Tw
[(standard database services.)-516(Alternati)25(v)15(ely)65(,)-266(the caller can)]TJ
T*
0.007 Tw
[(inte)15(grate locking of non-database resources with Berk)10(e-)]TJ
T*
0.27 Tw
[(le)15(y)-520(DB')55(s)-520(transactional tw)10(o-phase locking system, to)]TJ
T*
0.289 Tw
(impose transaction semantics on objects outside the)Tj
T*
(database.)Tj
/F8 1 Tf
12 0 0 12 323.2 369.6 Tm
0.25 Tw
[(3.1. Pr)18(ogrammatic )250(interfaces)]TJ
/F6 1 Tf
10 0 0 10 323.2 353.4 Tm
0.151 Tw
[(Berk)10(ele)15(y)-401(D)0(B)-401(de\214nes a simple API for database man-)]TJ
T*
0.095 Tw
[(agement. )-250(The)-345(package does not include industry-stan-)]TJ
T*
0.19 Tw
[(dard programmatic interf)10(aces such as Open Database)]TJ
T*
0.085 Tw
[(Connecti)25(vity \(ODBC\), Object Linking and Embedding)]TJ
T*
0.082 Tw
(for Databases \(OleDB\), or Structured Query Language)Tj
T*
0.153 Tw
[(\(SQL\). )-250(These)-403(interf)10(aces, while useful, were designed)]TJ
T*
0.248 Tw
(to promote interoperability of database systems, and)Tj
T*
0 Tw
(not simplicity or performance.)Tj
0 -1.62 TD
0.319 Tw
[(In response to customer demand, Berk)10(ele)15(y)-569(D)0(B)-569(2.5)]TJ
0 -1.2 TD
0.054 Tw
[(introduced support for the XA standard [Open94].)-554(XA)]TJ
T*
0.052 Tw
[(permits Berk)10(ele)15(y)-302(D)0(B)-302(t)0(o)-302(participate in distrib)20(uted trans-)]TJ
T*
0.337 Tw
[(actions under a transaction processing monitor lik)10(e)]TJ
T*
0.131 Tw
[(T)45(u)0(x)15(edo from BEA Systems.)-631(Lik)10(e)-381(XA, other standard)]TJ
T*
0.099 Tw
[(interf)10(aces can be b)20(uilt on top of the core system.)-599(The)]TJ
T*
0.085 Tw
[(standards do not belong inside Berk)10(ele)15(y)-335(DB, since not)]TJ
T*
0 Tw
(all applications need them.)Tj
/F8 1 Tf
12 0 0 12 323.2 139.2 Tm
[(3.2. )-250(W)75(orking with r)18(ecords)]TJ
/F6 1 Tf
10 0 0 10 323.2 123 Tm
0.063 Tw
[(A)-313(database user may need to search for particular k)10(e)15(ys)]TJ
T*
0.091 Tw
[(in a database, or may simply w)10(ant to bro)25(wse a)20(v)25(ailable)]TJ
T*
0.16 Tw
[(records. )-250(Berk)10(ele)15(y)-410(D)0(B)-410(supports both k)10(e)15(yed access, to)]TJ
T*
0.017 Tw
[(\214nd one or more records with a gi)25(v)15(e)0(n)-267(k)10(e)15(y)65(,)-267(o)0(r)-267(sequential)]TJ
T*
0.053 Tw
[(access, to retrie)25(v)15(e)15( )-15(all the records in the database one at)]TJ
ET
endstream
endobj
15 0 obj
<<
/ProcSet [/PDF /Text ]
/Font <<
/F2 4 0 R
/F4 5 0 R
/F6 6 0 R
/F8 7 0 R
>>
/ExtGState <<
/GS1 8 0 R
>>
>>
endobj
17 0 obj
<<
/Length 7813
>>
stream
BT
/F6 1 Tf
10 0 0 10 79.2 708 Tm
0 g
/GS1 gs
0 Tc
0.384 Tw
[(a)-634(time. )-250(The)-634(order of the records returned during)]TJ
0 -1.2 TD
0.021 Tw
[(sequential scans depends on the access method.)-521(B+tree)]TJ
T*
0.149 Tw
[(and Recno databases return records in sort order)40(,)-399(and)]TJ
T*
0.002 Tw
[(Hash databases return them in apparently random order)55(.)]TJ
0 -1.62 TD
0.246 Tw
[(Similarly)65(,)-496(Berk)10(ele)15(y)-496(D)0(B)-496(de\214nes simple interf)10(aces for)]TJ
0 -1.2 TD
0 Tw
(inserting, updating, and deleting records in a database.)Tj
/F8 1 Tf
12 0 0 12 79.2 613.8 Tm
[(3.3. )-250(Long)-250(k)10(eys and v)10(alues)]TJ
/F6 1 Tf
10 0 0 10 79.2 597.6 Tm
0.105 Tw
[(Berk)10(ele)15(y)-355(D)0(B)-355(manages k)10(e)15(ys)-355(and v)25(alues as lar)18(ge as 2)]TJ
8 0 0 8 295.187 602.6 Tm
(32)Tj
10 0 0 10 79.2 585.6 Tm
0.069 Tw
[(bytes. )-250(Since)-319(the time required to cop)10(y)-319(a)-319(record is pro-)]TJ
T*
0.189 Tw
[(portional to its size, Berk)10(ele)15(y)-440(D)0(B)-440(includes interf)10(aces)]TJ
T*
0.451 Tw
[(that operate on partial records.)-951(If an application)]TJ
T*
0.127 Tw
[(requires only part of a lar)18(ge record, it requests partial)]TJ
T*
0.002 Tw
[(record retrie)25(v)25(al, and recei)25(v)15(e)0(s)-253(just the bytes that it needs.)]TJ
T*
0 Tw
[(The smaller cop)10(y)-250(s)0(a)20(v)20( )245(es)-250(both time and memory)65(.)]TJ
0 -1.62 TD
0.071 Tw
[(Berk)10(ele)15(y)-321(D)0(B)-321(allo)25(ws the programmer to de\214ne the data)]TJ
0 -1.2 TD
0.272 Tw
[(types of k)10(e)15(ys)-522(and v)25(alues. )-250(De)25(v)15(elopers use an)15(y)-522(type)]TJ
T*
0 Tw
[(e)15(xpressible in the programming language.)]TJ
/F8 1 Tf
12 0 0 12 79.2 455.4 Tm
0.25 Tw
[(3.4. Lar)10(ge )250(databases)]TJ
/F6 1 Tf
10 0 0 10 79.2 439.2 Tm
0.075 Tw
[(A)-325(single database managed by Berk)10(ele)15(y)-326(D)0(B)-326(can be up)]TJ
T*
0.172 Tw
(to 2)Tj
8 0 0 8 96.195 432.2 Tm
(48)Tj
10 0 0 10 108.411 427.2 Tm
[(bytes, or 256 petabytes, in size.)-671(Berk)10(ele)15(y)-421(D)0(B)]TJ
-2.9211 -1.2 TD
0.214 Tw
(uses the host \214lesystem as the backing store for the)Tj
T*
0.267 Tw
[(database, so lar)18(ge databases require big \214le support)]TJ
T*
0.311 Tw
[(from the operating system.)-811(Sleep)10(ycat Softw)10(are has)]TJ
T*
0.571 Tw
[(customers using Berk)10(ele)15(y)-821(D)0(B)-821(t)0(o)-821(manage single)]TJ
T*
0 Tw
[(databases in e)15(xcess of 100 gigabytes.)]TJ
/F8 1 Tf
12 0 0 12 79.2 337.2 Tm
0.25 Tw
[(3.5. Main)-250(memory )250(databases)]TJ
/F6 1 Tf
10 0 0 10 79.2 321 Tm
0.117 Tw
(Applications that do not require persistent storage can)Tj
T*
0.012 Tw
[(create databases that e)15(xist only in main memory)65(.)-512(These)]TJ
T*
0.054 Tw
[(databases bypass the o)15(v)15(erhead imposed by the I/O sys-)]TJ
T*
0 Tw
[(tem altogether)55(.)]TJ
0 -1.62 TD
0.214 Tw
(Some applications do need to use disk as a backing)Tj
0 -1.2 TD
0.225 Tw
[(store, b)20(ut run on machines with v)15(ery lar)18(ge memory)65(.)]TJ
T*
0.03 Tw
[(Berk)10(ele)15(y)-280(D)0(B)-280(i)0(s)-280(able to manage v)15(ery lar)18(ge shared mem-)]TJ
T*
0.013 Tw
[(ory re)15(gions for cached data pages, log records, and lock)]TJ
T*
0.144 Tw
[(management. )-250(F)15(or e)15(xample, the cache re)15(gion used for)]TJ
T*
0.003 Tw
[(data pages may be gigabytes in size, reducing the lik)10(eli-)]TJ
T*
0.064 Tw
[(hood that an)15(y)-314(read operation will need to visit the disk)]TJ
T*
0.12 Tw
[(in the steady state.)-620(The programmer declares the size)]TJ
T*
0 Tw
[(of the cache re)15(gion at startup.)]TJ
0 -1.62 TD
0.455 Tw
[(Finally)65(,)-705(man)15(y)-705(operating systems pro)15(vide memory-)]TJ
0 -1.2 TD
0.253 Tw
[(mapped \214le services that are much f)10(aster than their)]TJ
T*
0.26 Tw
[(general-purpose \214le system interf)10(aces. )-250(Berk)10(ele)15(y)-510(D)0(B)]TJ
T*
0.512 Tw
(can memory-map its database \214les for read-only)Tj
T*
0.392 Tw
[(database use.)-892(The application operates on records)]TJ
T*
0.207 Tw
(stored directly on the pages, with no cache manage-)Tj
T*
0.156 Tw
[(ment o)15(v)15(erhead. )-250(Because)-406(the application gets pointers)]TJ
24.4 62.34 TD
0.126 Tw
[(directly into the Berk)10(ele)15(y)-376(D)0(B)-376(pages, writes cannot be)]TJ
0 -1.2 TD
0.127 Tw
[(permitted. )-250(Otherwise,)-377(changes could bypass the lock-)]TJ
T*
0.023 Tw
[(ing and logging systems, and softw)10(are errors could cor)20(-)]TJ
T*
0.401 Tw
[(rupt the database.)-901(Read-only applications can use)]TJ
T*
0.039 Tw
[(Berk)10(ele)15(y)-289(DB')55(s)-289(memory-mapped \214le service to impro)15(v)15(e)]TJ
T*
0 Tw
(performance on most architectures.)Tj
/F8 1 Tf
12 0 0 12 323.2 618 Tm
0.25 Tw
[(3.6. Con\214gurable)-250(page )250(size)]TJ
/F6 1 Tf
10 0 0 10 323.2 601.8 Tm
0.011 Tw
(Programmers declare the size of the pages used by their)Tj
T*
0.04 Tw
[(access methods when the)15(y)-290(create a database.)-540(Although)]TJ
T*
0.155 Tw
[(Berk)10(ele)15(y)-405(D)0(B)-405(pro)15(vides reasonable def)10(aults, de)25(v)15(elopers)]TJ
T*
0.364 Tw
[(may o)15(v)15(erride them to control system performance.)]TJ
T*
0.079 Tw
(Small pages reduce the number of records that \214t on a)Tj
T*
0.035 Tw
[(single page.)-535(Fe)25(wer records on a page means that fe)25(wer)]TJ
T*
0.072 Tw
[(records are lock)10(ed when the page is lock)10(ed, impro)15(ving)]TJ
T*
0.146 Tw
[(concurrenc)15(y)65(.)65( )-315(The per)20(-page o)15(v)15(erhead is proportionally)]TJ
T*
0.229 Tw
[(higher with smaller pages, of course, b)20(ut de)25(v)15(elopers)]TJ
T*
0 Tw
[(can trade of)25(f)-250(space for time as an application requires.)]TJ
/F8 1 Tf
12 0 0 12 323.2 463.8 Tm
0.25 Tw
[(3.7. Small)-250(f)25(ootprint)]TJ
/F6 1 Tf
10 0 0 10 323.2 447.6 Tm
0.147 Tw
[(Berk)10(ele)15(y)-397(D)0(B)-397(i)0(s)-397(a)-397(compact system.)-647(The full package,)]TJ
T*
0.083 Tw
[(including all access methods, reco)15(v)15(erability)65(,)-333(and trans-)]TJ
T*
0.123 Tw
[(action support is roughly 175K of te)15(xt space on com-)]TJ
T*
0 Tw
(mon architectures.)Tj
/F8 1 Tf
12 0 0 12 323.2 381.6 Tm
0.25 Tw
(3.8. Cursors)Tj
/F6 1 Tf
10 0 0 10 323.2 365.4 Tm
0.157 Tw
[(In database terminology)65(,)-407(a)-407(cursor is a pointer into an)]TJ
T*
0.181 Tw
[(access method that can be called iterati)25(v)15(ely to return)]TJ
T*
0.368 Tw
[(records in sequence.)-868(Berk)10(ele)15(y)-618(D)0(B)-618(includes cursor)]TJ
T*
0.281 Tw
[(interf)10(aces for all access methods.)-781(This permits, for)]TJ
T*
0.034 Tw
[(e)15(xample, users to tra)20(v)15(erse a B+tree and vie)25(w)-284(records in)]TJ
T*
0.123 Tw
[(order)55(.)-623(Pointers to records in cursors are persistent, so)]TJ
T*
0.178 Tw
(that once fetched, a record may be updated in place.)Tj
T*
0.194 Tw
[(Finally)65(,)-444(cursors support access to chains of duplicate)]TJ
T*
0 Tw
[(data items in the v)25(arious access methods.)]TJ
/F8 1 Tf
12 0 0 12 323.2 239.4 Tm
0.25 Tw
[(3.9. J)15(oins)]TJ
/F6 1 Tf
10 0 0 10 323.2 223.2 Tm
0.27 Tw
[(In database terminology)65(,)-520(a)-520(join is an operation that)]TJ
T*
0.062 Tw
[(spans multiple separate tables \(or in the case of Berk)10(e-)]TJ
T*
0.202 Tw
[(le)15(y)-452(DB, multiple separate DB \214les\).)-702(F)15(o)0(r)-452(e)15(xample, a)]TJ
T*
0.087 Tw
[(compan)15(y)-337(may store information about its customers in)]TJ
T*
0.154 Tw
[(one table and information about sales in another)55(.)-654(A)0(n)]TJ
T*
0.15 Tw
[(application will lik)10(ely w)10(ant to look up sales informa-)]TJ
T*
0.093 Tw
(tion by customer name; this requires matching records)Tj
T*
0.228 Tw
[(in the tw)10(o)-478(tables that share a common customer ID)]TJ
T*
0.001 Tw
[(\214eld. )-250(This)-251(combining of records from multiple tables is)]TJ
T*
0 Tw
(called a join.)Tj
0 -1.62 TD
0.306 Tw
[(Berk)10(ele)15(y)-556(D)0(B)-556(includes interf)10(aces for joining tw)10(o)-556(o)0(r)]TJ
0 -1.2 TD
0 Tw
(more tables.)Tj
ET
endstream
endobj
18 0 obj
<<
/ProcSet [/PDF /Text ]
/Font <<
/F6 6 0 R
/F8 7 0 R
>>
/ExtGState <<
/GS1 8 0 R
>>
>>
endobj
20 0 obj
<<
/Length 8168
>>
stream
BT
/F8 1 Tf
12 0 0 12 79.2 708 Tm
0 g
/GS1 gs
0 Tc
0.25 Tw
[(3.10. T)74(ransactions)]TJ
/F6 1 Tf
10 0 0 10 79.2 691.8 Tm
0 Tw
[(T)35(ransactions ha)20(v)15(e)15( )-15(four properties [Gray93]:)]TJ
8 0 0 8 84.2 675.6 Tm
(\203)Tj
10 0 0 10 104.2 675.6 Tm
0.299 Tw
[(The)15(y)-549(are atomic.)-799(That is, all of the changes)]TJ
0 -1.2 TD
0.147 Tw
(made in a single transaction must be applied at)Tj
T*
0.131 Tw
[(the same instant or not at all.)-631(This permits, for)]TJ
T*
0.356 Tw
[(e)15(xample, the transfer of mone)15(y)-606(between tw)10(o)]TJ
T*
0.368 Tw
(accounts to be accomplished, by making the)Tj
T*
0.127 Tw
(reduction of the balance in one account and the)Tj
T*
0 Tw
(increase in the other into a single, atomic action.)Tj
8 0 0 8 84.2 587.4 Tm
(\203)Tj
10 0 0 10 104.2 587.4 Tm
0.062 Tw
[(The)15(y)-312(must be consistent.)-562(That is, changes to the)]TJ
T*
0.363 Tw
[(database by an)15(y)-613(transaction cannot lea)20(v)15(e)15( )-15(the)]TJ
T*
0 Tw
[(database in an ille)15(gal)-250(o)0(r)-250(corrupt state.)]TJ
8 0 0 8 84.2 547.2 Tm
(\203)Tj
10 0 0 10 104.2 547.2 Tm
0.05 Tw
[(The)15(y)-301(must be isolatable.)-551(Re)15(gardless of the num-)]TJ
T*
0.08 Tw
[(ber of users w)10(orking in the database at the same)]TJ
T*
0.188 Tw
[(time, e)25(v)15(ery user must ha)20(v)15(e)15( )-15(the illusion that no)]TJ
T*
0 Tw
[(other acti)25(vity is going on.)]TJ
8 0 0 8 84.2 495 Tm
(\203)Tj
10 0 0 10 104.2 495 Tm
0.304 Tw
[(The)15(y)-554(must be durable.)-804(Ev)15(en if the disk that)]TJ
T*
0.088 Tw
(stores the database is lost, it must be possible to)Tj
T*
0.017 Tw
[(reco)15(v)15(e)0(r)-267(the database to its last transaction-consis-)]TJ
T*
0 Tw
(tent state.)Tj
-2.5 -1.62 TD
0.249 Tw
[(This combination of properties \212 atomicity)65(,)-499(consis-)]TJ
0 -1.2 TD
0.324 Tw
[(tenc)15(y)65(,)65( )-65(isolation, and durability \212 is referred to as)]TJ
T*
0.346 Tw
[(A)40(CIDity in the literature.)-846(Berk)10(ele)15(y)-596(DB, lik)10(e)-596(most)]TJ
T*
0.099 Tw
[(database systems, pro)15(vides A)40(CIDity using a collection)]TJ
T*
0 Tw
(of core services.)Tj
0 -1.62 TD
0.026 Tw
[(Programmers can choose to use Berk)10(ele)15(y)-276(DB')55(s)-276(transac-)]TJ
0 -1.2 TD
0 Tw
(tion services for applications that need them.)Tj
/F8 1 Tf
12 0 0 12 79.2 336.6 Tm
0.25 Tw
[(3.10.1. Write-ahead)-250(logging)]TJ
/F6 1 Tf
10 0 0 10 79.2 320.4 Tm
0.048 Tw
[(Programmers can enable the logging system when the)15(y)]TJ
T*
0.092 Tw
[(start up Berk)10(ele)15(y)-342(DB. )-250(During)-342(a)-342(transaction, the appli-)]TJ
T*
0.049 Tw
[(cation mak)10(es a series of changes to the database.)-549(Each)]TJ
T*
0.055 Tw
[(change is captured in a log entry)65(,)-305(which holds the state)]TJ
T*
0.021 Tw
(of the database record both before and after the change.)Tj
T*
0.221 Tw
(The log record is guaranteed to be \215ushed to stable)Tj
T*
0.087 Tw
[(storage before an)15(y)-337(o)0(f)-337(the changed data pages are writ-)]TJ
T*
0.149 Tw
[(ten. )-250(This)-399(beha)20(vior \212 writing the log before the data)]TJ
T*
0 Tw
(pages \212 is called)Tj
/F4 1 Tf
7.3316 0 TD
[(write-ahead lo)10(g)10(ging)]TJ
/F6 1 Tf
8.1185 0 TD
(.)Tj
-15.4501 -1.62 TD
0.083 Tw
[(At an)15(y)-333(time during the transaction, the application can)]TJ
/F4 1 Tf
0 -1.2 TD
(commit)Tj
/F6 1 Tf
2.9438 0 TD
0.17 Tw
[(,)-420(making the changes permanent, or)]TJ
/F4 1 Tf
15.5172 0 TD
[(r)45(oll bac)20(k)]TJ
/F6 1 Tf
3.6879 0 TD
0 Tw
(,)Tj
-22.1489 -1.2 TD
0.085 Tw
(cancelling all changes and restoring the database to its)Tj
T*
0.157 Tw
[(pre-transaction state.)-657(If the application rolls back the)]TJ
T*
0.1 Tw
(transaction, then the log holds the state of all changed)Tj
T*
0.05 Tw
[(pages prior to the transaction, and Berk)10(ele)15(y)-300(D)0(B)-300(simply)]TJ
T*
0.023 Tw
[(restores that state.)-523(If the application commits the trans-)]TJ
T*
0.054 Tw
[(action, Berk)10(ele)15(y)-304(D)0(B)-304(writes the log records to disk.)-554(In-)]TJ
T*
0.231 Tw
(memory copies of the data pages already re\215ect the)Tj
T*
0.14 Tw
[(changes, and will be \215ushed as necessary during nor)20(-)]TJ
T*
0.235 Tw
[(mal processing.)-735(Since log writes are sequential, b)20(u)0(t)]TJ
T*
0.873 Tw
[(data page writes are random, this impro)15(v)15(e)0(s)]TJ
24.4 63.18 TD
(performance.)Tj
/F8 1 Tf
12 0 0 12 323.2 678 Tm
0.25 Tw
[(3.10.2. Crashes)-250(and )250(r)18(eco)10(v)10(ery)]TJ
/F6 1 Tf
10 0 0 10 323.2 661.8 Tm
0.109 Tw
[(Berk)10(ele)15(y)-359(DB')55(s)-359(write-ahead log is used by the transac-)]TJ
0 -1.2 TD
0.041 Tw
[(tion system to commit or roll back transactions.)-541(It also)]TJ
T*
0.073 Tw
[(gi)25(v)15(e)0(s)-323(the reco)15(v)15(ery system the information that it needs)]TJ
T*
0.082 Tw
(to protect against data loss or corruption from crashes.)Tj
T*
0.02 Tw
[(Berk)10(ele)15(y)-270(D)0(B)-270(i)0(s)-270(able to survi)25(v)15(e)15( )-15(application crashes, sys-)]TJ
T*
0.041 Tw
[(tem crashes, and e)25(v)15(en)-291(catastrophic f)10(ailures lik)10(e)-291(the loss)]TJ
T*
0 Tw
[(of a hard disk, without losing an)15(y)-250(data.)]TJ
0 -1.62 TD
0.054 Tw
[(Survi)25(ving crashes requires data stored in se)25(v)15(eral dif)25(fer)20(-)]TJ
0 -1.2 TD
0.252 Tw
[(ent places.)-752(During normal processing, Berk)10(ele)15(y)-502(D)0(B)]TJ
T*
0.077 Tw
[(has copies of acti)25(v)15(e)15( )-15(log records and recently-used data)]TJ
T*
0.154 Tw
[(pages in memory)65(.)-654(Log records are \215ushed to the log)]TJ
T*
0.069 Tw
[(disk when transactions commit.)-569(Data pages trickle out)]TJ
T*
0.001 Tw
(to the data disk as pages m)Tj
10.7251 0 TD
-0.015 Tc
(ove )Tj
1.6647 0 TD
0 Tc
[(through the b)20(u)0(f)25(fer cache.)]TJ
-12.3898 -1.2 TD
0.019 Tw
[(Periodically)65(,)-269(the system administrator backs up the data)]TJ
T*
0.028 Tw
[(disk, creating a safe cop)10(y)-278(o)0(f)-278(the database at a particular)]TJ
T*
0.011 Tw
[(instant. )-250(When)-261(the database is back)10(ed up, the log can be)]TJ
T*
0.134 Tw
[(truncated. )-250(F)15(or maximum rob)20(ustness, the log disk and)]TJ
T*
0 Tw
[(data disk should be separate de)25(vices.)]TJ
0 -1.62 TD
0.129 Tw
[(Dif)25(ferent system f)10(ailures can destro)10(y)-379(memory)65(,)-379(the log)]TJ
0 -1.2 TD
0.111 Tw
[(disk, or the data disk.)-611(Berk)10(ele)15(y)-361(D)0(B)-361(i)0(s)-361(able to survi)25(v)15(e)]TJ
T*
0.068 Tw
[(the loss of an)15(y)-318(one of these repositories without losing)]TJ
T*
0 Tw
[(an)15(y)-250(committed transactions.)]TJ
0 -1.62 TD
0.137 Tw
[(If the computer')55(s)-387(memory is lost, through an applica-)]TJ
0 -1.2 TD
0.162 Tw
(tion or operating system crash, then the log holds all)Tj
T*
0.179 Tw
[(committed transactions.)-679(On restart, the reco)15(v)15(ery sys-)]TJ
T*
0.049 Tw
[(tem rolls the log forw)10(ard against the database, reapply-)]TJ
T*
0.068 Tw
[(ing an)15(y)-318(changes to on-disk pages that were in memory)]TJ
T*
0.014 Tw
[(at the time of the crash.)-514(Since the log contains pre- and)]TJ
T*
0.096 Tw
[(post-change state for transactions, the reco)15(v)15(ery system)]TJ
T*
0.114 Tw
[(also uses the log to restore an)15(y)-364(pages to their original)]TJ
T*
0.161 Tw
[(state if the)15(y)-411(were modi\214ed by transactions that ne)25(v)15(e)0(r)]TJ
T*
(committed.)Tj
0 -1.62 TD
0.205 Tw
(If the data disk is lost, the system administrator can)Tj
0 -1.2 TD
0.089 Tw
[(restore the most recent cop)10(y)-339(from backup.)-589(The reco)15(v-)]TJ
T*
0.13 Tw
[(ery system will roll the entire log forw)10(ard against the)]TJ
T*
0.264 Tw
(original database, reapplying all committed changes.)Tj
T*
0.436 Tw
[(When it \214nishes, the database will contain e)25(v)15(ery)]TJ
T*
0.053 Tw
[(change made by e)25(v)15(ery transaction that e)25(v)15(er)-303(committed.)]TJ
0 -1.62 TD
0.049 Tw
[(If the log disk is lost, then the reco)15(v)15(ery system can use)]TJ
0 -1.2 TD
0.185 Tw
[(the in-memory copies of log entries to roll back an)15(y)]TJ
T*
0.003 Tw
(uncommitted transactions, \215ush all in-memory database)Tj
T*
0.166 Tw
[(pages to the data disk, and shut do)25(wn gracefully)65(.)-666(A)0(t)]TJ
T*
0.22 Tw
(that point, the system administrator can back up the)Tj
T*
0.004 Tw
[(database disk, install a ne)25(w)-254(log disk, and restart the sys-)]TJ
T*
(tem.)Tj
ET
endstream
endobj
21 0 obj
<<
/ProcSet [/PDF /Text ]
/Font <<
/F4 5 0 R
/F6 6 0 R
/F8 7 0 R
>>
/ExtGState <<
/GS1 8 0 R
>>
>>
endobj
23 0 obj
<<
/Length 8561
>>
stream
BT
/F8 1 Tf
12 0 0 12 79.2 708 Tm
0 g
/GS1 gs
0 Tc
0.25 Tw
(3.10.3. Checkpoints)Tj
/F6 1 Tf
10 0 0 10 79.2 691.8 Tm
0.358 Tw
[(Berk)10(ele)15(y)-608(D)0(B)-608(includes a checkpointing service that)]TJ
0 -1.2 TD
0.026 Tw
[(interacts with the reco)15(v)15(ery system.)-526(During normal pro-)]TJ
T*
0.241 Tw
(cessing, both the log and the database are changing)Tj
T*
0.092 Tw
[(continually)65(.)-592(A)0(t)-342(a)0(n)15(y)15( )-15(gi)25(v)25( )332(en)-342(instant, the on-disk v)15(ersions)]TJ
T*
0.041 Tw
[(of the tw)10(o)-291(are not guaranteed to be consistent.)-541(The log)]TJ
T*
0.384 Tw
(probably contains changes that are not yet in the)Tj
T*
(database.)Tj
0 -1.62 TD
0.008 Tw
[(When an application mak)10(es a)]TJ
/F4 1 Tf
12.057 0 TD
[(c)15(hec)20(kpoint)]TJ
/F6 1 Tf
4.2967 0 TD
[(,)-259(all committed)]TJ
-16.3537 -1.2 TD
0.044 Tw
(changes in the log up to that point are guaranteed to be)Tj
T*
0.063 Tw
[(present on the data disk, too.)-563(Checkpointing is moder)20(-)]TJ
T*
0.004 Tw
[(ately e)15(xpensi)25(v)15(e)15( )-16(during normal processing, b)20(ut limits the)]TJ
T*
0 Tw
[(time spent reco)15(v)15(ering from crashes.)]TJ
0 -1.62 TD
0.312 Tw
(After an application or operating system crash, the)Tj
0 -1.2 TD
0.742 Tw
[(reco)15(v)15(ery system only needs to go back tw)10(o)]TJ
0 -1.4 TD
(checkpoints)Tj
7 0 0 7 126.967 517.4 Tm
0 Tw
(1)Tj
10 0 0 10 134.343 513.4 Tm
0.138 Tw
[(to start rolling the log forw)10(ard. )-249(W)40(ithout)]TJ
-5.5143 -1.2 TD
0.326 Tw
[(checkpoints, there is no w)10(ay to be sure ho)25(w)-576(long)]TJ
T*
0.039 Tw
[(restarting after a crash will tak)10(e. )-250(W)40(ith checkpoints, the)]TJ
T*
0.009 Tw
[(restart interv)25(al can be \214x)15(ed by the programmer)55(.)-509(Reco)15(v-)]TJ
T*
0.067 Tw
(ery processing can be guaranteed to complete in a sec-)Tj
T*
0 Tw
[(ond or tw)10(o.)]TJ
0 -1.62 TD
0.246 Tw
[(Softw)10(are crashes are much more common than disk)]TJ
0 -1.2 TD
0.088 Tw
[(f)10(ailures. )-250(Man)15(y)-338(d)0(e)25(v)25( )328(elopers w)10(ant to guarantee that soft-)]TJ
T*
0.016 Tw
[(w)10(are b)20(ugs do not destro)10(y)-266(data, b)20(ut are willing to restore)]TJ
T*
0.063 Tw
[(from tape, and to tolerate a day or tw)10(o)-313(o)0(f)-313(lost w)10(ork, in)]TJ
T*
0.089 Tw
[(the unlikle)15(y)-339(e)25(v)15(ent of a disk crash.)-589(W)40(ith Berk)10(ele)15(y)-339(DB,)]TJ
T*
0.109 Tw
[(programmers may truncate the log at checkpoints.)-609(As)]TJ
T*
0.009 Tw
[(long as the tw)10(o)-259(most recent checkpoints are present, the)]TJ
T*
0.011 Tw
[(reco)15(v)15(ery system can guarantee that no committed trans-)]TJ
T*
0.061 Tw
[(actions are lost after a softw)10(are crash.)-561(In this case, the)]TJ
T*
0.144 Tw
[(reco)15(v)15(ery system does not require that the log and the)]TJ
T*
0.133 Tw
[(data be on separate de)25(vices, although separating them)]TJ
T*
0 Tw
(can still impr)Tj
5.2777 0 TD
-0.015 Tc
(ove )Tj
1.6639 0 TD
0 Tc
(performance by spreading out writes.)Tj
/F8 1 Tf
12 0 0 12 79.2 275.2 Tm
0.25 Tw
[(3.10.4. T)74(w)10(o-phase )250(locking)]TJ
/F6 1 Tf
10 0 0 10 79.2 259 Tm
0.191 Tw
[(Berk)10(ele)15(y)-442(D)0(B)-442(pro)15(vides a service kno)25(wn as tw)10(o-phase)]TJ
0 -1.2 TD
0.052 Tw
[(locking. )-250(In)-302(order to reduce the lik)10(elihood of deadlocks)]TJ
T*
0.255 Tw
[(and to guarantee A)40(CID properties, database systems)]TJ
T*
0.006 Tw
[(manage locks in tw)10(o)-256(phases. )-250(First,)-256(during the operation)]TJ
T*
0.157 Tw
[(of a transaction, the)15(y)-407(acquire locks, b)20(ut ne)25(v)15(e)0(r)-407(release)]TJ
T*
0.365 Tw
[(them. )-250(Second,)-615(at the end of the transaction, the)15(y)]TJ
T*
0.023 Tw
[(release locks, b)20(ut ne)25(v)15(e)0(r)-273(acquire them.)-523(In practice, most)]TJ
T*
0.469 Tw
[(database systems, including Berk)10(ele)15(y)-719(DB, acquire)]TJ
T*
0.231 Tw
[(locks on demand o)15(v)15(e)0(r)-481(the course of the transaction,)]TJ
T*
0 Tw
(then \215ush the log, then release all locks.)Tj
ET
0 G
1 J 1 j 0.32 w 10 M []0 d
1 i
79.23 141.39 m
223.23 141.39 l
S
BT
5 0 0 5 100.8 131 Tm
(1)Tj
8 0 0 8 105.638 127.8 Tm
0.042 Tw
[(One checkpoint is not f)10(ar enough.)-542(The reco)15(v)15(ery system can-)]TJ
-3.3048 -1.2 TD
0.026 Tw
[(not be sure that the most recent checkpoint completed \212 it may ha)20(v)15(e)]TJ
T*
0.092 Tw
[(been interrupted by the crash that forced the reco)15(v)15(ery system to run)]TJ
T*
0 Tw
(in the \214rst place.)Tj
10 0 0 10 323.2 708 Tm
0.081 Tw
[(Berk)10(ele)15(y)-331(D)0(B)-331(can lock entire database \214les, which cor)20(-)]TJ
T*
0.084 Tw
[(respond to tables, or indi)25(vidual pages in them.)-584(It does)]TJ
T*
0.214 Tw
[(no record-le)25(v)15(e)0(l)-464(locking. )-250(By)-464(shrinking the page size,)]TJ
T*
0.363 Tw
[(ho)25(we)25(v)15(e)0(r)40(,)40( )-40(de)25(v)25( )603(elopers can guarantee that e)25(v)15(ery page)]TJ
T*
0.21 Tw
[(holds only a small number of records.)-710(This reduces)]TJ
T*
(contention.)Tj
0 -1.62 TD
0.039 Tw
(If locking is enabled, then read and write operations on)Tj
0 -1.2 TD
0.282 Tw
[(a)-532(database acquire tw)10(o-phase locks, which are held)]TJ
T*
0.363 Tw
[(until the transaction completes.)-863(Which objects are)]TJ
T*
0.074 Tw
[(lock)10(ed and the order of lock acquisition depend on the)]TJ
T*
0.05 Tw
[(w)10(orkload for each transaction.)-550(It is possible for tw)10(o)-300(o)0(r)]TJ
T*
0.131 Tw
[(more transactions to deadlock, so that each is w)10(aiting)]TJ
T*
0 Tw
[(for a lock that is held by another)55(.)]TJ
0 -1.62 TD
0.081 Tw
[(Berk)10(ele)15(y)-331(D)0(B)-331(detects deadlocks and automatically rolls)]TJ
0 -1.2 TD
0.182 Tw
[(back one of the transactions.)-682(This releases the locks)]TJ
T*
0.192 Tw
[(that it held and allo)25(ws the other transactions to con-)]TJ
T*
0.085 Tw
[(tinue. )-250(The)-335(caller is noti\214ed that its transaction did not)]TJ
T*
0.175 Tw
[(complete, and may restart it.)-675(De)25(v)15(elopers can specify)]TJ
T*
0.065 Tw
[(the deadlock detection interv)25(al and the polic)15(y)-315(t)0(o)-315(use in)]TJ
T*
0 Tw
(choosing a transaction to roll back.)Tj
0 -1.62 TD
0.669 Tw
[(The tw)10(o-phase locking interf)10(aces are separately)]TJ
0 -1.2 TD
0.093 Tw
[(callable by applications that link Berk)10(ele)15(y)-343(DB, though)]TJ
T*
0.314 Tw
[(fe)25(w)-564(users ha)20(v)15(e)15( )-15(needed to use that f)10(acility directly)65(.)]TJ
T*
0.221 Tw
[(Using these interf)10(aces, Berk)10(ele)15(y)-471(D)0(B)-471(pro)15(vides a f)10(ast,)]TJ
T*
0.24 Tw
(platform-portable locking system for general-purpose)Tj
T*
0.042 Tw
[(use. )-250(It)-292(also lets users include non-database objects in a)]TJ
T*
0.35 Tw
(database transaction, by controlling access to them)Tj
T*
0 Tw
[(e)15(xactly as if the)15(y)-250(were inside the database.)]TJ
0 -1.62 TD
0.058 Tw
[(The Berk)10(ele)15(y)-308(D)0(B)-308(t)0(w)10(o-phase locking f)10(acility is b)20(uilt on)]TJ
0 -1.2 TD
0.061 Tw
[(the f)10(astest correct locking primiti)25(v)15(e)0(s)-311(that are supported)]TJ
T*
0.197 Tw
[(by the underlying architecture.)-697(In the current imple-)]TJ
T*
0.059 Tw
[(mentation, this means that the locking system is dif)25(fer)20(-)]TJ
T*
0.171 Tw
[(ent on the v)25(arious UNIX platforms, and is still more)]TJ
T*
0.069 Tw
[(dif)25(ferent on W)40(indo)25(ws NT)74(.)-569(I)0(n)-319(our e)15(xperience, the most)]TJ
T*
0.263 Tw
[(dif)25(\214cult aspect of performance tuning is \214nding the)]TJ
T*
0.088 Tw
[(f)10(astest locking primiti)25(v)15(e)0(s)-338(that w)10(ork correctly on a par)20(-)]TJ
T*
0.126 Tw
[(ticular architecture and then inte)15(grating the ne)25(w)-376(inter)20(-)]TJ
T*
0 Tw
[(f)10(ace with the se)25(v)15(eral that we already support.)]TJ
0 -1.62 TD
0.054 Tw
[(The w)10(orld w)10(ould be a better place if the operating sys-)]TJ
0 -1.2 TD
0.21 Tw
[(tems community w)10(ould uniformly implement POSIX)]TJ
T*
0.131 Tw
[(locking primiti)25(v)15(e)0(s)-381(and w)10(ould guarantee that acquiring)]TJ
T*
0.108 Tw
[(an uncontested lock w)10(as a f)10(ast operation.)-608(Locks must)]TJ
T*
0.364 Tw
[(w)10(ork both among threads in a single process and)]TJ
T*
0 Tw
(among processes.)Tj
/F8 1 Tf
12 0 0 12 323.2 141 Tm
0.25 Tw
[(3.11. Concurr)18(ency)]TJ
/F6 1 Tf
10 0 0 10 323.2 124.8 Tm
0.038 Tw
(Good performance under concurrent operation is a crit-)Tj
T*
0.077 Tw
[(ical design point for Berk)10(ele)15(y)-327(DB. )-249(Although)-326(Berk)10(ele)15(y)]TJ
T*
0.196 Tw
(DB is itself not multi-threaded, it is thread-safe, and)Tj
T*
0.055 Tw
[(runs well in threaded applications.)-555(Philosophically)65(,)-305(w)0(e)]TJ
T*
0.226 Tw
[(vie)25(w)-476(the use of threads and the choice of a threads)]TJ
ET
endstream
endobj
24 0 obj
<<
/ProcSet [/PDF /Text ]
/Font <<
/F4 5 0 R
/F6 6 0 R
/F8 7 0 R
>>
/ExtGState <<
/GS1 8 0 R
>>
>>
endobj
26 0 obj
<<
/Length 8362
>>
stream
BT
/F6 1 Tf
10 0 0 10 79.2 708 Tm
0 g
/GS1 gs
0 Tc
0.006 Tw
[(package as a polic)15(y)-257(decision, and prefer to of)25(fer mecha-)]TJ
0 -1.2 TD
0.004 Tw
[(nism \(the ability to run threaded or not\), allo)25(wing appli-)]TJ
T*
0 Tw
[(cations to choose their o)25(wn policies.)]TJ
0 -1.62 TD
0.195 Tw
[(The locking, logging, and b)20(u)0(f)25(fer pool subsystems all)]TJ
0 -1.2 TD
0.071 Tw
[(use shared memory or other OS-speci\214c sharing f)10(acili-)]TJ
T*
0.171 Tw
[(ties to communicate.)-671(Locks, b)20(u)0(f)25(fer pool fetches, and)]TJ
T*
0.106 Tw
[(log writes beha)20(v)15(e)15( )-15(in)-356(the same w)10(ay across threads in a)]TJ
T*
0.003 Tw
[(single process as the)15(y)-253(d)0(o)-253(across dif)25(ferent processes on a)]TJ
T*
0 Tw
(single machine.)Tj
0 -1.62 TD
0.09 Tw
(As a result, concurrent database applications may start)Tj
0 -1.2 TD
0.165 Tw
[(up a ne)25(w)-415(process for e)25(v)15(ery single user)40(,)-415(may create a)]TJ
T*
0.285 Tw
[(single serv)15(er which spa)15(wns a ne)25(w)-535(thread for e)25(v)15(ery)]TJ
T*
0 Tw
[(client request, or may choose an)15(y)-250(polic)15(y)-250(i)0(n)-250(between.)]TJ
0 -1.62 TD
0.113 Tw
[(Berk)10(ele)15(y)-363(D)0(B)-363(has been carefully designed to minimize)]TJ
0 -1.2 TD
0.007 Tw
[(contention and maximize concurrenc)15(y)65(.)65( )-315(The cache man-)]TJ
T*
0.057 Tw
[(ager allo)25(ws all threads or processes to bene\214t from I/O)]TJ
T*
0.292 Tw
[(done by one.)-792(Shared resources must sometimes be)]TJ
T*
0.18 Tw
[(lock)10(ed for e)15(xclusi)25(v)15(e)15( )-15(access by one thread of control.)]TJ
T*
0.016 Tw
[(W)80(e)80( )-80(ha)20(v)20( )261(e)-266(k)10(ept critical sections small, and are careful not)]TJ
T*
0.12 Tw
(to hold critical resource locks across system calls that)Tj
T*
0.054 Tw
[(could deschedule the locking thread or process.)-554(Sleep-)]TJ
T*
0.098 Tw
[(ycat Softw)10(are has customers with hundreds of concur)20(-)]TJ
T*
0 Tw
[(rent users w)10(orking on a single database in production.)]TJ
/F8 1 Tf
12 0 0 12 79.2 401.4 Tm
0.25 Tw
[(4. Engineering)-250(Philosoph)15(y)]TJ
/F6 1 Tf
10 0 0 10 79.2 385.2 Tm
0.15 Tw
[(Fundamentally)65(,)-400(Berk)10(ele)15(y)-400(D)0(B)-400(i)0(s)-400(a)-400(collection of access)]TJ
T*
0.019 Tw
[(methods with important f)10(acilities, lik)10(e)-269(logging, locking,)]TJ
T*
0.125 Tw
[(and transactional access underlying them.)-625(In both the)]TJ
T*
0.099 Tw
[(research and the commercial w)10(orld, the techniques for)]TJ
T*
0.273 Tw
[(b)20(uilding systems lik)10(e)-523(Berk)10(ele)15(y)-523(D)0(B)-523(h)0(a)20(v)20( )518(e)-523(been well-)]TJ
T*
0 Tw
[(kno)25(wn for a long time.)]TJ
0 -1.62 TD
0.044 Tw
[(The k)10(e)15(y)15( )-15(adv)25(antage of Berk)10(ele)15(y)-294(D)0(B)-294(i)0(s)-294(the careful atten-)]TJ
0 -1.2 TD
0.106 Tw
(tion that has been paid to engineering details through-)Tj
T*
0.104 Tw
[(out its life.)-604(W)80(e)80( )-80(ha)20(v)20( )349(e)-354(carefully designed the system so)]TJ
T*
0.045 Tw
[(that the core f)10(acilities, lik)10(e)-295(locking and I/O, surf)10(ace the)]TJ
T*
0.097 Tw
[(right interf)10(aces and are otherwise opaque to the caller)55(.)]TJ
T*
0.029 Tw
[(As programmers, we understand the v)25(alue of simplicity)]TJ
T*
0.02 Tw
[(and ha)20(v)15(e)15( )-16(w)10(ork)10(ed hard to simplify the interf)10(aces we sur)20(-)]TJ
T*
0 Tw
[(f)10(ace to users of the database system.)]TJ
0 -1.62 TD
0.203 Tw
[(Berk)10(ele)15(y)-453(D)0(B)-453(a)20(v)20(oids limits in the code.)-703(It places no)]TJ
0 -1.2 TD
0.047 Tw
[(practical limit on the size of k)10(e)15(ys, v)25(alues, or databases;)]TJ
T*
0 Tw
[(the)15(y)-250(may gro)25(w)-250(t)0(o)-250(occup)10(y)-250(the a)20(v)25(ailable storage space.)]TJ
0 -1.62 TD
0.186 Tw
[(The locking and logging subsystems ha)20(v)15(e)15( )-15(been care-)]TJ
0 -1.2 TD
0.018 Tw
(fully crafted to reduce contention and impr)Tj
17.2724 0 TD
-0.015 Tc
(ove )Tj
1.6823 0 TD
0 Tc
(through-)Tj
-18.9546 -1.2 TD
0.216 Tw
(put by shrinking or eliminating critical sections, and)Tj
T*
0 Tw
[(reducing the sizes of lock)10(ed re)15(gions and log entries.)]TJ
0 -1.62 TD
0.224 Tw
(There is nothing in the design or implementation of)Tj
0 -1.2 TD
0.032 Tw
[(Berk)10(ele)15(y)-282(D)0(B)-282(that pushes the state of the art in database)]TJ
T*
0.104 Tw
[(systems. )-250(Rather)40(,)-354(w)0(e)-354(h)0(a)20(v)20( )349(e)-354(been v)15(ery careful to get the)]TJ
T*
0.432 Tw
[(engineering right.)-932(The result is a system that is)]TJ
24.4 62.76 TD
0.037 Tw
[(superior)40(,)-287(a)0(s)-287(a)0(n)-287(embedded database system, to an)15(y)-287(other)]TJ
0 -1.2 TD
0 Tw
[(solution a)20(v)25(ailable.)]TJ
0 -1.62 TD
0.081 Tw
[(Most database systems trade of)25(f)-331(simplicity for correct-)]TJ
0 -1.2 TD
0.165 Tw
[(ness. )-250(Either)-415(the system is easy to use, or it supports)]TJ
T*
0.117 Tw
[(concurrent use and survi)25(v)15(e)0(s)-367(system f)10(ailures. )-250(Berk)10(ele)15(y)]TJ
T*
0.101 Tw
(DB, because of its careful design and implementation,)Tj
T*
0 Tw
[(of)25(fers both simplicity and correctness.)]TJ
0 -1.62 TD
0.076 Tw
[(The system has a small footprint, mak)10(es simple opera-)]TJ
0 -1.2 TD
0.101 Tw
[(tions simple to carry out \(inserting a ne)25(w)-351(record tak)10(es)]TJ
T*
0.116 Tw
[(just a fe)25(w)-366(lines of code\), and beha)20(v)15(e)0(s)-366(correctly in the)]TJ
T*
0.053 Tw
[(f)10(ace of hea)20(vy concurrent use, system crashes, and e)25(v)15(en)]TJ
T*
0 Tw
[(catastrophic f)10(ailures lik)10(e)-250(loss of a hard disk.)]TJ
/F8 1 Tf
12 0 0 12 323.2 537.6 Tm
[(5. )-250(The)-250(Berk)10(eley DB 2.x Distrib)20(ution)]TJ
/F6 1 Tf
10 0 0 10 323.2 521.4 Tm
0.167 Tw
[(Berk)10(ele)15(y)-417(D)0(B)-417(i)0(s)-417(distrib)20(uted in source code form from)]TJ
/F4 1 Tf
T*
[(www)74(.sleepycat.com)]TJ
/F6 1 Tf
7.8135 0 TD
0.232 Tw
[(.)-732(Users are free to do)25(wnload and)]TJ
-7.8135 -1.2 TD
0 Tw
[(b)20(uild the softw)10(are, and to use it in their applications.)]TJ
/F8 1 Tf
12 0 0 12 323.2 467.4 Tm
[(5.1. )-250(What)-250(is in the distrib)20(ution)]TJ
/F6 1 Tf
10 0 0 10 323.2 451.2 Tm
0.483 Tw
[(The distrib)20(ution is a compressed archi)25(v)15(e)15( )-15(\214le. )-250(It)]TJ
T*
0.006 Tw
[(includes the source code for the Berk)10(ele)15(y)-256(D)0(B)-256(library)65(,)-256(a)0(s)]TJ
T*
0.045 Tw
(well as documentation, test suites, and supporting utili-)Tj
T*
(ties.)Tj
0 -1.62 TD
0.261 Tw
[(The source code includes b)20(uild support for all sup-)]TJ
0 -1.2 TD
0.025 Tw
[(ported platforms.)-525(On UNIX systems Berk)10(ele)15(y)-275(D)0(B)-275(uses)]TJ
T*
0.128 Tw
(the GNU autocon\214guration tool,)Tj
/F2 1 Tf
13.7602 0 TD
(autoconf)Tj
/F6 1 Tf
4.7997 0 TD
[(,)-378(t)0(o)-378(iden-)]TJ
-18.5599 -1.2 TD
0.099 Tw
[(tify the system and to b)20(uild the library and supporting)]TJ
T*
0.109 Tw
[(utilities. )-250(Berk)10(ele)15(y)-359(D)0(B)-359(includes speci\214c b)20(uild en)40(viron-)]TJ
T*
0.051 Tw
[(ments for other platforms, such as VMS and W)40(indo)25(ws.)]TJ
/F8 1 Tf
12 0 0 12 323.2 309 Tm
0.25 Tw
(5.1.1. Documentation)Tj
/F6 1 Tf
10 0 0 10 323.2 292.8 Tm
0.501 Tw
[(The distrib)20(uted system includes documentation in)]TJ
T*
0.163 Tw
[(HTML format.)-663(The documentation is in tw)10(o)-413(parts: a)]TJ
T*
0.072 Tw
(UNIX-style reference manual for use by programmers,)Tj
T*
0 Tw
(and a reference guide which is tutorial in nature.)Tj
/F8 1 Tf
12 0 0 12 323.2 226.8 Tm
0.25 Tw
[(5.1.2. T)92(est )250(suite)]TJ
/F6 1 Tf
10 0 0 10 323.2 210.6 Tm
0.111 Tw
[(The softw)10(are also includes a complete test suite, writ-)]TJ
T*
0.015 Tw
[(ten in Tcl.)-515(W)80(e)80( )-80(belie)25(v)15(e)15( )-15(that the test suite is a k)10(e)15(y)15( )-15(adv)25(an-)]TJ
T*
0 Tw
[(tage of Berk)10(ele)15(y)-250(D)0(B)-250(o)15(v)15(e)0(r)-250(comparable systems.)]TJ
0 -1.62 TD
0.261 Tw
[(First, the test suite allo)25(ws users who do)25(wnload and)]TJ
0 -1.2 TD
0.173 Tw
[(b)20(uild the softw)10(are to be sure that it is operating cor)20(-)]TJ
T*
[(rectly)65(.)]TJ
0 -1.62 TD
0.089 Tw
[(Second, the test suite allo)25(ws us, lik)10(e)-339(other commercial)]TJ
0 -1.2 TD
0.053 Tw
[(de)25(v)15(elopers of database softw)10(are, to e)15(x)15(ercise the system)]TJ
T*
0.226 Tw
[(thoroughly at e)25(v)15(ery release.)-726(When we learn of ne)25(w)]TJ
T*
0.172 Tw
[(b)20(ugs, we add them to the test suite.)-672(W)80(e)80( )-80(run the test)]TJ
T*
0.569 Tw
[(suite continually during de)25(v)15(elopment c)15(ycles, and)]TJ
ET
endstream
endobj
27 0 obj
<<
/ProcSet [/PDF /Text ]
/Font <<
/F2 4 0 R
/F4 5 0 R
/F6 6 0 R
/F8 7 0 R
>>
/ExtGState <<
/GS1 8 0 R
>>
>>
endobj
29 0 obj
<<
/Length 7983
>>
stream
BT
/F6 1 Tf
10 0 0 10 79.2 708 Tm
0 g
/GS1 gs
0 Tc
0.031 Tw
[(al)10(w)10(ays prior to release.)-531(The result is a much more reli-)]TJ
0 -1.2 TD
0 Tw
(able system by the time it reaches beta release.)Tj
/F8 1 Tf
12 0 0 12 79.2 666 Tm
0.25 Tw
[(5.2. Binary)-250(distrib)20(ution)]TJ
/F6 1 Tf
10 0 0 10 79.2 649.8 Tm
0.089 Tw
[(Sleep)10(ycat mak)10(es compiled libraries and general binary)]TJ
T*
0 Tw
[(distrib)20(utions a)20(v)25(ailable to customers for a fee.)]TJ
/F8 1 Tf
12 0 0 12 79.2 607.8 Tm
0.25 Tw
[(5.3. Supported)-250(platf)25(orms)]TJ
/F6 1 Tf
10 0 0 10 79.2 591.6 Tm
0.312 Tw
[(Berk)10(ele)15(y)-562(D)0(B)-562(runs on an)15(y)-562(operating system with a)]TJ
T*
0.082 Tw
[(POSIX 1003.1 interf)10(ace [IEEE96], which includes vir)20(-)]TJ
T*
0.2 Tw
[(tually e)25(v)15(ery UNIX system.)-700(In addition, the softw)10(are)]TJ
T*
0.285 Tw
[(runs on VMS, W)40(indo)25(ws/95, W)40(indo)25(ws/98, and W)40(in-)]TJ
T*
0.521 Tw
[(do)25(ws/NT)74(.)-1021(Sleep)10(ycat Softw)10(are no longer supports)]TJ
T*
0 Tw
[(deplo)10(yment on sixteen-bit W)40(indo)25(ws systems.)]TJ
/F8 1 Tf
12 0 0 12 79.2 501.6 Tm
[(6. )-250(Berk)10(eley DB 2.x Licensing)]TJ
/F6 1 Tf
10 0 0 10 79.2 485.4 Tm
0.013 Tw
[(Berk)10(ele)15(y)-263(D)0(B)-263(2.x is distrib)20(uted as an Open Source prod-)]TJ
T*
0.221 Tw
[(uct. )-250(The)-471(softw)10(are is freely a)20(v)25(ailable from us at our)]TJ
T*
0.087 Tw
[(W)80(e)0(b)-337(site, and in other media.)-587(Users are free to do)25(wn-)]TJ
T*
0 Tw
[(load the softw)10(are and b)20(uild applications with it.)]TJ
0 -1.62 TD
0.102 Tw
[(The 1.x v)15(ersions of Berk)10(ele)15(y)-352(D)0(B)-352(were co)15(v)15(ered by the)]TJ
0 -1.2 TD
0.376 Tw
[(UC Berk)10(ele)15(y)-626(cop)10(yright that co)15(v)15(ers softw)10(are freely)]TJ
T*
0.174 Tw
[(redistrib)20(utable in source form.)-674(When Sleep)10(ycat Soft-)]TJ
T*
0.091 Tw
[(w)10(are w)10(as formed, we needed to draft a license consis-)]TJ
T*
0.232 Tw
[(tent with the cop)10(yright go)15(v)15(erning the e)15(xisting, older)]TJ
T*
0.283 Tw
[(softw)10(are. )-250(Because)-533(of important dif)25(ferences between)]TJ
T*
0.05 Tw
[(the UC Berk)10(ele)15(y)-300(cop)10(yright and the GPL, it w)10(as impos-)]TJ
T*
0.088 Tw
[(sible for us to use the GPL.)-588(A)-338(second cop)10(yright, with)]TJ
T*
0.087 Tw
[(terms contradictory to the \214rst, simply w)10(ould not ha)20(v)15(e)]TJ
T*
[(w)10(ork)10(ed.)]TJ
0 -1.62 TD
0.253 Tw
[(Sleep)10(ycat w)10(anted to continue Open Source de)25(v)15(elop-)]TJ
0 -1.2 TD
0.208 Tw
[(ment of Berk)10(ele)15(y)-458(D)0(B)-458(for se)25(v)15(eral reasons.)-708(W)80(e)80( )-80(agree)]TJ
T*
0.085 Tw
(with Raymond [Raym98] and others that Open Source)Tj
T*
0.076 Tw
[(softw)10(are is typically of higher quality than proprietary)65(,)]TJ
T*
0.262 Tw
[(binary-only products.)-762(Our customers bene\214t from a)]TJ
T*
0.098 Tw
[(community of de)25(v)15(elopers who kno)25(w)-348(and use Berk)10(ele)15(y)]TJ
T*
0.132 Tw
[(DB, and can help with application design, deb)20(ugging,)]TJ
T*
0.165 Tw
[(and performance tuning.)-665(W)40(idespread distrib)20(ution and)]TJ
T*
0.102 Tw
[(use of the source code tends to isolate b)20(ugs early)65(,)-352(and)]TJ
T*
0.003 Tw
[(to get \214x)15(es back into the distrib)20(uted system quickly)65(.)-503(A)0(s)]TJ
T*
0.105 Tw
[(a)-355(result, Berk)10(ele)15(y)-355(D)0(B)-355(i)0(s)-355(more reliable.)-605(Just as impor)20(-)]TJ
T*
0.119 Tw
[(tantly)65(,)-369(indi)25(vidual users are able to contrib)20(ute ne)25(w)-369(fea-)]TJ
T*
0.106 Tw
(tures and performance enhancements, to the bene\214t of)Tj
T*
0.036 Tw
[(e)25(v)25( )276(eryone who uses Berk)10(ele)15(y)-286(DB. )-250(From)-286(a)-286(b)20(usiness per)20(-)]TJ
T*
0.061 Tw
[(specti)25(v)15(e)0(,)-311(Open Source and free distrib)20(ution of the soft-)]TJ
T*
0.16 Tw
[(w)10(are creates share for us, and gi)25(v)15(e)0(s)-410(u)0(s)-410(a)-410(mark)10(et into)]TJ
T*
0.041 Tw
[(which we can sell products and services.)-541(Finally)65(,)-291(mak-)]TJ
T*
0.015 Tw
[(ing the source code freely a)20(v)25(ailable reduces our support)]TJ
T*
0.244 Tw
[(load, since customers can \214nd and \214x b)20(ugs without)]TJ
T*
0 Tw
[(recourse to us, in man)15(y)-250(cases.)]TJ
24.4 62.7 TD
0.313 Tw
[(T)80(o)80( )-80(preserv)15(e)-563(the Open Source heritage of the older)]TJ
0 -1.2 TD
0.05 Tw
[(Berk)10(ele)15(y)-300(D)0(B)-300(code, we drafted a ne)25(w)-300(license go)15(v)15(erning)]TJ
T*
0.042 Tw
[(the distrib)20(ution of Berk)10(ele)15(y)-292(D)0(B)-292(2.x. )-250(W)80(e)-292(adopted terms)]TJ
T*
[(from the GPL that mak)10(e)-291(i)0(t)-291(impossible to turn our Open)]TJ
T*
0.129 Tw
[(Source code into proprietary code o)25(wned by someone)]TJ
T*
(else.)Tj
0 -1.62 TD
0.068 Tw
[(Brie\215y)65(,)-318(the terms go)15(v)15(erning the use and distrib)20(ution of)]TJ
0 -1.2 TD
[(Berk)10(ele)15(y)-250(D)0(B)-250(are:)]TJ
8 0 0 8 328.2 603.6 Tm
0 Tw
(\203)Tj
10 0 0 10 348.2 603.6 Tm
(your application must be internal to your site, or)Tj
8 0 0 8 328.2 587.4 Tm
(\203)Tj
10 0 0 10 348.2 587.4 Tm
0.061 Tw
[(your application must be freely redistrib)20(utable in)]TJ
T*
0 Tw
(source form, or)Tj
8 0 0 8 328.2 559.2 Tm
(\203)Tj
10 0 0 10 348.2 559.2 Tm
(you must get a license from us.)Tj
-2.5 -1.62 TD
0.013 Tw
[(F)15(o)0(r)-263(customers who prefer not to distrib)20(ute Open Source)]TJ
0 -1.2 TD
0.149 Tw
[(products, we sell licenses to use and e)15(xtend Berk)10(ele)15(y)]TJ
T*
0 Tw
(DB at a reasonable cost.)Tj
0 -1.62 TD
0.108 Tw
[(W)80(e)80( )-79(w)10(ork hard to accommodate the needs of the Open)]TJ
0 -1.2 TD
0.06 Tw
[(Source community)65(.)-561(F)15(or e)15(xample, we ha)20(v)15(e)15( )-15(crafted spe-)]TJ
T*
0.141 Tw
(cial licensing arrangements with Gnome to encourage)Tj
T*
0 Tw
[(its use and distrib)20(ution of Berk)10(ele)15(y)-250(DB.)]TJ
0 -1.62 TD
0.16 Tw
[(Berk)10(ele)15(y)-410(D)0(B)-410(conforms to the Open Source de\214nition)]TJ
0 -1.2 TD
0.237 Tw
[([Open99]. )-250(The)-487(license has been carefully crafted to)]TJ
T*
0.064 Tw
[(k)10(eep the product a)20(v)25(ailable as an Open Source of)25(fering,)]TJ
T*
0 Tw
[(while pro)15(viding enough of a return on our in)40(v)15(estment to)]TJ
T*
0.155 Tw
[(fund continued de)25(v)15(elopment and support of the prod-)]TJ
T*
0.053 Tw
[(uct. )-250(The)-303(current license has created a b)20(usiness capable)]TJ
T*
0.092 Tw
[(of funding three years of de)25(v)15(elopment on the softw)10(are)]TJ
T*
0 Tw
[(that simply w)10(ould not ha)20(v)15(e)15( )-15(happened otherwise.)]TJ
/F8 1 Tf
12 0 0 12 323.2 336.6 Tm
0.25 Tw
(7. Summary)Tj
/F6 1 Tf
10 0 0 10 323.2 320.4 Tm
0.049 Tw
[(Berk)10(ele)15(y)-299(D)0(B)-299(o)0(f)25(fers a unique collection of features, tar)20(-)]TJ
T*
0.017 Tw
[(geted squarely at softw)10(are de)25(v)15(elopers who need simple,)]TJ
T*
0.049 Tw
(reliable database management services in their applica-)Tj
T*
0.28 Tw
[(tions. )-250(Good)-530(design and implementation and careful)]TJ
T*
0.163 Tw
[(engineering throughout mak)10(e)-413(the softw)10(are better than)]TJ
T*
0 Tw
[(man)15(y)-250(other systems.)]TJ
0 -1.62 TD
0.16 Tw
[(Berk)10(ele)15(y)-410(D)0(B)-410(i)0(s)-410(a)0(n)-410(Open Source product, a)20(v)25(ailable at)]TJ
/F4 1 Tf
0 -1.2 TD
[(www)74(.sleepycat.com)]TJ
/F6 1 Tf
8.1289 0 TD
0.065 Tw
[(for do)25(wnload. )-250(The)-315(distrib)20(uted sys-)]TJ
-8.1289 -1.2 TD
0.038 Tw
[(tem includes e)25(v)15(erything needed to b)20(uild and deplo)10(y)-288(the)]TJ
T*
0 Tw
[(softw)10(are or to port it to ne)25(w)-250(systems.)]TJ
0 -1.62 TD
0.263 Tw
[(Sleep)10(ycat Softw)10(are distrib)20(utes Berk)10(ele)15(y)-513(D)0(B)-513(under a)]TJ
0 -1.2 TD
0.076 Tw
[(license agreement that dra)15(ws on both the UC Berk)10(ele)15(y)]TJ
T*
0.238 Tw
[(cop)10(yright and the GPL.)-738(The license guarantees that)]TJ
T*
0.088 Tw
[(Berk)10(ele)15(y)-338(D)0(B)-338(will remain an Open Source product and)]TJ
T*
0.149 Tw
[(pro)15(vides Sleep)10(ycat with opportunities to mak)10(e)-399(mone)15(y)]TJ
T*
0 Tw
[(to fund continued de)25(v)15(elopment on the softw)10(are.)]TJ
ET
endstream
endobj
30 0 obj
<<
/ProcSet [/PDF /Text ]
/Font <<
/F4 5 0 R
/F6 6 0 R
/F8 7 0 R
>>
/ExtGState <<
/GS1 8 0 R
>>
>>
endobj
32 0 obj
<<
/Length 3333
>>
stream
BT
/F8 1 Tf
12 0 0 12 79.2 708 Tm
0 g
/GS1 gs
0 Tc
0.25 Tw
[(8. Refer)18(ences)]TJ
/F6 1 Tf
10 0 0 10 79.2 691.8 Tm
([Come79])Tj
2.5 -1.2 TD
0.063 Tw
[(Comer)40(,)-313(D., \231The Ubiquitous B-tree,)70(\232)]TJ
/F4 1 Tf
15.2835 0 TD
[(A)30(C)0(M)-313(Com-)]TJ
-15.2835 -1.2 TD
0.04 Tw
[(puting Surve)30(ys)]TJ
/F6 1 Tf
6.2164 0 TD
[(V)129(olume 11, number 2, June 1979.)]TJ
-8.7164 -1.62 TD
([Gray93])Tj
2.5 -1.2 TD
0.048 Tw
[(Gray)65(,)-298(J., and Reuter)40(,)-298(A.,)]TJ
/F4 1 Tf
10.1053 0 TD
-0.21 Tw
[(T)55(r)55( ansaction )-258(Pr)45(ocessing:)]TJ
-10.1053 -1.2 TD
0.678 Tw
[(Concepts and T)92(e)0(c)15(hniques)]TJ
/F6 1 Tf
11.5246 0 TD
[(,)-928(Mor)18(gan-Kaufman)]TJ
-11.5246 -1.2 TD
0 Tw
(Publishers, 1993.)Tj
-2.5 -1.62 TD
([IEEE96])Tj
2.5 -1.2 TD
0.036 Tw
(Institute for Electrical and Electronics Engineers,)Tj
/F4 1 Tf
T*
0 Tw
(IEEE/ANSI Std 1003.1)Tj
/F6 1 Tf
9.0824 0 TD
[(,)-250(1996 Edition.)]TJ
-11.5824 -1.62 TD
([Litw80])Tj
2.5 -1.2 TD
0.236 Tw
[(Litwin, W)92(., \231Linear Hashing: A Ne)25(w)-487(T)80(ool for)]TJ
T*
0.178 Tw
[(File and T)80(able Addressing,)70(\232)]TJ
/F4 1 Tf
12.0887 0 TD
[(Pr)45(oceedings of the)]TJ
-12.0887 -1.2 TD
0.48 Tw
[(6th International Confer)37(ence on V)111(ery Lar)37(g)10(e)]TJ
T*
0.198 Tw
(Databases \(VLDB\))Tj
/F6 1 Tf
7.8358 0 TD
[(,)-448(Montreal, Quebec, Canada,)]TJ
-7.8358 -1.2 TD
0 Tw
(October 1980.)Tj
-2.5 -1.62 TD
([Open94])Tj
2.5 -1.2 TD
0.407 Tw
(The Open Group,)Tj
/F4 1 Tf
8.496 0 TD
[(Distrib)20(uted TP: The XA+)]TJ
-8.496 -1.2 TD
0.078 Tw
[(Speci\214cation, V)111(e)0(r)10(sion 2)]TJ
/F6 1 Tf
9.5614 0 TD
[(,)-328(The Open Group, 1994.)]TJ
-12.0614 -1.62 TD
([Open99])Tj
2.5 -1.2 TD
0.831 Tw
[(Opensource.or)18(g, \231Open Source De\214nition,)70(\232)]TJ
/F4 1 Tf
T*
[(www)74(.opensour)37(ce)15(.or)37(g/osd.html)]TJ
/F6 1 Tf
12.0313 0 TD
0.063 Tw
[(,)-313(v)15(ersion 1.4, 1999.)]TJ
-14.5313 -1.62 TD
([Raym98])Tj
2.5 -1.2 TD
0.072 Tw
[(Raymond, E.S., \231The Cathedral and the Bazaar)40(,)70(\232)]TJ
/F4 1 Tf
T*
[(www)74(.tuxedo.or)37(g/~esr/writings/cathedr)15(al-)]TJ
T*
[(bazaar/cathedr)15(al-bazaar)111(.html)]TJ
/F6 1 Tf
11.9013 0 TD
0 Tw
[(,)-250(January 1998.)]TJ
-14.4013 -1.62 TD
([Selt91])Tj
2.5 -1.2 TD
0.008 Tw
[(Seltzer)40(,)-258(M., and Y)55(igit, O., \231)80(A)-258(Ne)25(w)25( )-25(Hashing P)15(ack-)]TJ
T*
0.67 Tw
[(age for UNIX,)70(\232)]TJ
/F4 1 Tf
8.4378 0 TD
[(Pr)45(oceedings 1991 W)55(inter)]TJ
-8.4378 -1.2 TD
0 Tw
[(USENIX Confer)37(ence)]TJ
/F6 1 Tf
8.2665 0 TD
[(,)-250(Dallas, TX, January 1991.)]TJ
-10.7665 -1.62 TD
([Selt92])Tj
2.5 -1.2 TD
0.286 Tw
[(Seltzer)40(,)-536(M., and Olson, M., \231LIBTP: Portable)]TJ
T*
0.284 Tw
[(Modular T)35(ransactions for UNIX,)70(\232)]TJ
/F4 1 Tf
14.9452 0 TD
[(Pr)45(oceedings)]TJ
-14.9452 -1.2 TD
0.149 Tw
[(1992 W)55(inter Usenix Confer)37(ence)]TJ
/F6 1 Tf
13.2132 0 TD
[(,)-399(San Francisco,)]TJ
-13.2132 -1.2 TD
0 Tw
(CA, January 1992.)Tj
-2.5 -1.62 TD
([Ston82])Tj
2.5 -1.2 TD
0.754 Tw
[(Stonebrak)10(er)40(,)-1004(M., Stettner)40(,)-1004(H., Kalash, J.,)]TJ
T*
0.076 Tw
[(Guttman, A., and L)55(ynn, N., \231Document Process-)]TJ
T*
0.056 Tw
[(ing in a Relational Database System,)70(\232)-306(Memoran-)]TJ
T*
0.082 Tw
[(dum No. UCB/ERL M82/32, Uni)25(v)15(ersity of Cali-)]TJ
T*
0 Tw
[(fornia at Berk)10(ele)15(y)65(,)65( )-65(Berk)10(ele)15(y)65(,)65( )-65(CA, May 1982.)]TJ
ET
endstream
endobj
33 0 obj
<<
/ProcSet [/PDF /Text ]
/Font <<
/F4 5 0 R
/F6 6 0 R
/F8 7 0 R
>>
/ExtGState <<
/GS1 8 0 R
>>
>>
endobj
34 0 obj
<<
/Type /Halftone
/HalftoneType 1
/HalftoneName (Default)
/Frequency 60
/Angle 45
/SpotFunction /Round
>>
endobj
8 0 obj
<<
/Type /ExtGState
/SA false
/OP false
/HT /Default
>>
endobj
4 0 obj
<<
/Type /Font
/Subtype /Type1
/Name /F2
/Encoding 35 0 R
/BaseFont /Courier
>>
endobj
5 0 obj
<<
/Type /Font
/Subtype /Type1
/Name /F4
/Encoding 35 0 R
/BaseFont /Times-Italic
>>
endobj
6 0 obj
<<
/Type /Font
/Subtype /Type1
/Name /F6
/Encoding 35 0 R
/BaseFont /Times-Roman
>>
endobj
7 0 obj
<<
/Type /Font
/Subtype /Type1
/Name /F8
/Encoding 35 0 R
/BaseFont /Times-Bold
>>
endobj
35 0 obj
<<
/Type /Encoding
/Differences [ 0/asciicircum/asciitilde/Scaron/Zcaron/scaron/zcaron/Ydieresis/trademark
/quotesingle 94/circumflex 126/tilde 128/quotesinglbase/guillemotleft/guillemotright/bullet/florin
/fraction/perthousand/dagger/daggerdbl/endash/emdash/ff/fi
/fl/ffi/ffl/dotlessi/dotlessj/grave/hungarumlaut/dotaccent
/breve/caron/ring/ogonek/quotedblleft/quotedblright/oe/lslash
/quotedblbase/OE/Lslash 164/currency 166/brokenbar 168/dieresis/copyright/ordfeminine
/guilsinglleft/logicalnot/minus/registered/macron/degree/plusminus/twosuperior
/threesuperior/acute/mu 183/periodcentered/cedilla/onesuperior/ordmasculine/guilsinglright
/onequarter/onehalf/threequarters 192/Agrave/Aacute/Acircumflex/Atilde/Adieresis
/Aring/AE/Ccedilla/Egrave/Eacute/Ecircumflex/Edieresis/Igrave
/Iacute/Icircumflex/Idieresis/Eth/Ntilde/Ograve/Oacute/Ocircumflex
/Otilde/Odieresis/multiply/Oslash/Ugrave/Uacute/Ucircumflex/Udieresis
/Yacute/Thorn/germandbls/agrave/aacute/acircumflex/atilde/adieresis
/aring/ae/ccedilla/egrave/eacute/ecircumflex/edieresis/igrave
/iacute/icircumflex/idieresis/eth/ntilde/ograve/oacute/ocircumflex
/otilde/odieresis/divide/oslash/ugrave/uacute/ucircumflex/udieresis
/yacute/thorn/ydieresis
]
>>
endobj
1 0 obj
<<
/Type /Page
/Parent 9 0 R
/Resources 3 0 R
/Contents 2 0 R
>>
endobj
10 0 obj
<<
/Type /Page
/Parent 9 0 R
/Resources 12 0 R
/Contents 11 0 R
>>
endobj
13 0 obj
<<
/Type /Page
/Parent 9 0 R
/Resources 15 0 R
/Contents 14 0 R
>>
endobj
16 0 obj
<<
/Type /Page
/Parent 9 0 R
/Resources 18 0 R
/Contents 17 0 R
>>
endobj
19 0 obj
<<
/Type /Page
/Parent 9 0 R
/Resources 21 0 R
/Contents 20 0 R
>>
endobj
22 0 obj
<<
/Type /Page
/Parent 9 0 R
/Resources 24 0 R
/Contents 23 0 R
>>
endobj
25 0 obj
<<
/Type /Page
/Parent 9 0 R
/Resources 27 0 R
/Contents 26 0 R
>>
endobj
28 0 obj
<<
/Type /Page
/Parent 9 0 R
/Resources 30 0 R
/Contents 29 0 R
>>
endobj
31 0 obj
<<
/Type /Page
/Parent 9 0 R
/Resources 33 0 R
/Contents 32 0 R
>>
endobj
9 0 obj
<<
/Type /Pages
/Kids [1 0 R 10 0 R 13 0 R 16 0 R 19 0 R 22 0 R 25 0 R 28 0 R 31 0 R]
/Count 9
/MediaBox [0 0 612 792]
>>
endobj
36 0 obj
<<
/Type /Catalog
/Pages 9 0 R
>>
endobj
37 0 obj
<<
/CreationDate (D:19990512125327)
/Producer (Acrobat Distiller 3.01 for Power Macintosh)
/Creator (groff version 1.11)
>>
endobj
xref
0 38
0000000000 65535 f
0000070835 00000 n
0000000016 00000 n
0000006981 00000 n
0000069211 00000 n
0000069306 00000 n
0000069406 00000 n
0000069505 00000 n
0000069140 00000 n
0000071579 00000 n
0000070915 00000 n
0000007105 00000 n
0000015158 00000 n
0000070998 00000 n
0000015263 00000 n
0000023664 00000 n
0000071081 00000 n
0000023789 00000 n
0000031655 00000 n
0000071164 00000 n
0000031760 00000 n
0000039981 00000 n
0000071247 00000 n
0000040096 00000 n
0000048710 00000 n
0000071330 00000 n
0000048825 00000 n
0000057240 00000 n
0000071413 00000 n
0000057365 00000 n
0000065401 00000 n
0000071496 00000 n
0000065516 00000 n
0000068902 00000 n
0000069017 00000 n
0000069603 00000 n
0000071716 00000 n
0000071766 00000 n
trailer
<<
/Size 38
/Root 36 0 R
/Info 37 0 R
/ID [<33a8fd20984732815d705c6403f375d5><33a8fd20984732815d705c6403f375d5>]
>>
startxref
71906
%%EOF
9 0 obj
<<
/Type /Pages
/Kids [ 38 0 R 1 0 R 10 0 R 13 0 R 16 0 R 19 0 R 22 0 R 25 0 R 28 0 R 31 0 R
]
/Count 10
/MediaBox [ 0 0 612 792 ]
>>
endobj
37 0 obj
<<
/CreationDate (D:19990512125327)
/Producer (Acrobat Distiller 3.01 for Power Macintosh)
/Creator (groff version 1.11)
/ModDate (D:19990512125419)
>>
endobj
38 0 obj
<<
/Type /Page
/Parent 9 0 R
/Resources 40 0 R
/Contents 39 0 R
>>
endobj
39 0 obj
<< /Length 2136 /Filter /LZWDecode >>
stream
�pi� ��c(�@E,F"H�2�EÁ��l6�#����G#1�zA"�C(��g.��cY��]6�
�#A����P���n6%���'K���51uV&6�dCi�J1T�PFS���08��(#t~Z2.�K��� ��-s�e��v�
s�v����qc,m�1�`"��dv�.�T,�[��k��kyM.��C�����t~�O�
'#1�B?!
G8��>�/���1��:_#�Ԍ�q��k-w�l�e�b����2m�~�;�v��$COX�w�
n����&��.��`��)�b�?0+�#A��>�
�8jZ�+,����<�s��#��'�B���R��:��j��!���)��E��t��,�c%rk"�P�:�;�[��j8M������*��ۆ(�y0LI���A�57H��&�'2�J��R�>����9��Л�S\[@�����L������,�B����ҫ6����ۆ�,�Ĉ�ME�t�=+�z�²0�pn�J2#�@GYV�P\P-]�¾R�:���
x������Y�@sE�M��Z�ޢ%�
!\@�]1d�е�aX��r���qR�o�V^h�����b±���gg��b�=�i+�.�Kx��ܺ����/���HQ��� ��:�+n�d�
±���Y�G-�� 5J�pŨ�$��*�@�O�֊��T�Sg��ڛ��\͛�X|ܽ<��ͣi�θ�5{�����~���ҭ�a��h3 j͙�Nz��f�,};�uc�@�����:�6n�}�ᝑqq��CȰ��MrP, ��n�@�?C�t��q�nٻ(��Y��,K���&տR�ٺm����b�S�Z뮲8@9�o��A��b�:v�R�6��͛����z�&�_�z��a0v�zţ껃a�m�B����h
�aGE�������8)�������Qip����K$&4��&&Q_S�|ʩ�;(F����L-�U㜸��7=$m�7�ZoOj(.�C'.���t7������Vm��ig�b� �F�|��x���0dt��w��
�z�6�&*�T��#�&�bz�܁$_�t�|�� 9
�Ap
�H�� �@�BpC�0�B�U
�$'p@��S
�&�9Oa$ ɰ��HT
Q�#��t˂]�K�JU�Z��AH9
dL3��!! h��3��xw
!�3��C�e@�;�Ði�l0�p�7��b
��9� �Ep �JX�8�Ge��D2^!؞L�ɵX����pP�oa�2�I�3Ao��i��(E �$��@�CkS�PIv!�L6�0���CdI��źB�"��07��;AT��4���@C
�3N ���,��ڤ˅�K��'����w���<���6��J�Ad����j�*�k�@5�%P�4��\rH�|ia1���>��] �8��
q��,��
�jWh���7�@��Y,�@�&���l����'���T�K��}��Øt���0���Cr�O�p��,#J�^]5�M�y��H�Ri�S��8��$����W�Vu�����)yX��U������!������1N��4�B �9�p�C
ªt��9":]�0C�$"Hp�5�R�A�:�C���V��f���-d�M�L;wc��'����D�!�2m�`�C%G
�zpN"�ha�s���94�u�s���9��2�h���P�&�y�V�(N �=�i�dP�zN�t����7U0��hm�X>v������$� n�
���
#�C�4k+�I��\����� �R2���7&�O�.(�sFz�ygN@���Q2Ln�sl1���(��ӺWK�M�7�r�2�H<�3������`�FA�@kϡ��@ϊ��aғ쌢�Y�OBm�9f�1�Ǘ��P��m��N����Hb
�kN�ꇃ0v�T�jh}���h[������`
$v�
���\�A��!c�>C�
�7F�.�@��tF$xs�����CHl��h3T��6W
!�N��ɉ"�_�q�H�:�;��]�jwL
�����9�
����
endstream
endobj
40 0 obj
<<
/ProcSet [ /PDF /Text ]
/Font << /F2 46 0 R /F4 44 0 R /F7 43 0 R >>
/ExtGState << /GS1 42 0 R /GS2 41 0 R >>
>>
endobj
41 0 obj
<<
/Type /ExtGState
/SA true
/OP false
/HT /Default
>>
endobj
42 0 obj
<<
/Type /ExtGState
/SA false
/OP false
/HT /Default
>>
endobj
43 0 obj
<<
/Type /Font
/Subtype /Type1
/Name /F7
/Encoding /MacRomanEncoding
/BaseFont /Helvetica
>>
endobj
44 0 obj
<<
/Type /Font
/Subtype /Type1
/Name /F4
/Encoding 45 0 R
/BaseFont /Times-Italic
>>
endobj
45 0 obj
<<
/Type /Encoding
/Differences [ 9 /space 39 /quotesingle 96 /grave 128 /Adieresis /Aring /Ccedilla
/Eacute /Ntilde /Odieresis /Udieresis /aacute /agrave /acircumflex
/adieresis /atilde /aring /ccedilla /eacute /egrave /ecircumflex
/edieresis /iacute /igrave /icircumflex /idieresis /ntilde /oacute
/ograve /ocircumflex /odieresis /otilde /uacute /ugrave /ucircumflex
/udieresis /dagger /degree 164 /section /bullet /paragraph /germandbls
/registered /copyright /trademark /acute /dieresis /notequal /AE
/Oslash /infinity /plusminus /lessequal /greaterequal /yen /mu /partialdiff
/summation /product /pi /integral /ordfeminine /ordmasculine /Omega
/ae /oslash /questiondown /exclamdown /logicalnot /radical /florin
/approxequal /Delta /guillemotleft /guillemotright /ellipsis /space
/Agrave /Atilde /Otilde /OE /oe /endash /emdash /quotedblleft /quotedblright
/quoteleft /quoteright /divide /lozenge /ydieresis /Ydieresis /fraction
/currency /guilsinglleft /guilsinglright /fi /fl /daggerdbl /periodcentered
/quotesinglbase /quotedblbase /perthousand /Acircumflex /Ecircumflex
/Aacute /Edieresis /Egrave /Iacute /Icircumflex /Idieresis /Igrave
/Oacute /Ocircumflex /apple /Ograve /Uacute /Ucircumflex /Ugrave
246 /circumflex /tilde /macron /breve /dotaccent /ring /cedilla
/hungarumlaut /ogonek /caron ]
>>
endobj
46 0 obj
<<
/Type /Font
/Subtype /Type1
/Name /F2
/Encoding 45 0 R
/BaseFont /Times-Roman
>>
endobj
xref
0 1
0000000000 65535 f
9 1
0000072822 00000 n
37 10
0000072978 00000 n
0000073148 00000 n
0000073237 00000 n
0000075446 00000 n
0000075583 00000 n
0000075660 00000 n
0000075738 00000 n
0000075854 00000 n
0000075962 00000 n
0000077308 00000 n
trailer
<<
/Size 47
/Info 37 0 R
/Root 36 0 R
/Prev 71906
/ID[<33a8fd20984732815d705c6403f375d5><25eefe413fa44dff24172f0b12fcc88f>]
>>
startxref
77415
%%EOF