Current by Vladimir Alexiev
on Feb 18, 2013 13:22.

{toc}

h2. Old
[^rs-kbgen-times.xlsx].
The 1st dataset has 6k objects; the 2nd has 44k objects (with lots of sameAs links, though, so only about half of them show as Museum objects in the UI).
* The speed of adding objects is 700/min for the smaller dataset vs. 1300/min for the larger one; this also includes the overhead of parsing
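
To put the two rates in perspective, here is a rough sketch that converts them to objects per second and estimates how long each load would take. It assumes the rounded object counts quoted above (6k and 44k) and a constant rate over the whole load; the function and variable names are illustrative only.

```python
# Rates quoted above, in objects per minute (parsing overhead included).
RATE_SMALL_PER_MIN = 700    # 6k-object dataset
RATE_LARGE_PER_MIN = 1300   # 44k-object dataset


def load_minutes(num_objects: int, rate_per_min: float) -> float:
    """Estimated minutes to add num_objects, assuming a constant rate."""
    return num_objects / rate_per_min


# Object counts are the rounded figures from the text (an assumption).
small_minutes = load_minutes(6_000, RATE_SMALL_PER_MIN)
large_minutes = load_minutes(44_000, RATE_LARGE_PER_MIN)

print(f"6k objects:  ~{small_minutes:.1f} min at {RATE_SMALL_PER_MIN / 60:.1f} obj/s")
print(f"44k objects: ~{large_minutes:.1f} min at {RATE_LARGE_PER_MIN / 60:.1f} obj/s")
```

Under these assumptions, the smaller dataset loads in under 10 minutes and the larger one in a little over half an hour.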

h2. New Mapping
[^BM-loading-time.xls]
{viewxls:BM-loading-time.xls}
- the 115k repo uses the new objects, but old thesauri/images files
- FTS indexing is quite fast, but FTS size is still too large

h2. Full Set
See [BM Data Volumetrics#Full Set]
- Storage location was a RAM drive; it took 55G out of 64G. According to previous experiments on other servers, using a RAM drive for repo load is several times faster
- storage size: 50+GB
- adding BM objects: start speed 132 obj/s, end speed 26 obj/s; approx. 20h total time
- ~2M BM objects according to nuxeo ID file (not 1.5M as we said before)
- (?) lots of DBpedia thesauri items, w/o label; don't know where those came from
- ~407,000 thesauri items, indexed in 3min
- nuxeo ids added in 2600s (<1h)
- failed w/ Exception during Rembrandt paintings
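
As a sanity check on the figures above, the sketch below brackets the expected load time between the start and end speeds and compares that with the reported 20h. It assumes that all ~2M objects were added within those 20h, which these notes do not confirm (the load ended with an exception); the names are illustrative only.

```python
TOTAL_OBJECTS = 2_000_000   # ~2M BM objects per the nuxeo ID file
START_RATE = 132            # obj/s at the start of the load
END_RATE = 26               # obj/s at the end of the load
REPORTED_HOURS = 20         # approximate total time from the notes


def hours_at(rate_obj_per_s: float, total: int = TOTAL_OBJECTS) -> float:
    """Hours needed to add `total` objects at a constant rate."""
    return total / rate_obj_per_s / 3600


best_case_hours = hours_at(START_RATE)    # if the start speed had held
worst_case_hours = hours_at(END_RATE)     # if the whole load ran at end speed

# Average rate implied by the reported total time (assumes all objects loaded).
implied_avg_rate = TOTAL_OBJECTS / (REPORTED_HOURS * 3600)

print(f"best case:   {best_case_hours:.1f} h")
print(f"worst case:  {worst_case_hours:.1f} h")
print(f"implied avg: {implied_avg_rate:.1f} obj/s over {REPORTED_HOURS} h")
```

Under that assumption, the implied average rate sits much closer to the end speed than to the start speed, which suggests the slowdown set in early in the load.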