I added several tests that measure performance on CSV files as the file size increases. Currently, throughput drops quickly as the files get larger. For example, running the tests on the machine described in the environment section gave the following numbers (file size, then throughput):
200k - 58.3 kb/s
400k - 36.3 kb/s
600k - 26.8 kb/s
800k - 20.3 kb/s
1m - 15.0 kb/s
5m - 1.7 kb/s
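As a rough way to read these numbers, each throughput figure can be converted back into an elapsed time (size divided by rate) and compared across sizes. The sketch below does that arithmetic. It assumes "200k" means 200 kB, "1m" means 1000 kB, and "kb/s" means kilobytes per second; the object and method names are hypothetical and are not part of TestCSV.scala.

```scala
// Back-of-the-envelope analysis of the figures reported above.
// Assumptions: sizes are in kB ("1m" = 1000 kB), rates are in kB/s.
object CsvScaling {
  // (file size in kB, measured throughput in kB/s), copied from the report.
  val measurements: Seq[(Double, Double)] = Seq(
    200.0 -> 58.3,
    400.0 -> 36.3,
    600.0 -> 26.8,
    800.0 -> 20.3,
    1000.0 -> 15.0,
    5000.0 -> 1.7
  )

  // Elapsed time implied by a size and a throughput.
  def elapsedSeconds(sizeKb: Double, kbPerSec: Double): Double =
    sizeKb / kbPerSec

  // Estimate p in "time grows like size^p" from two measurements.
  def growthExponent(a: (Double, Double), b: (Double, Double)): Double = {
    val ta = elapsedSeconds(a._1, a._2)
    val tb = elapsedSeconds(b._1, b._2)
    math.log(tb / ta) / math.log(b._1 / a._1)
  }

  def main(args: Array[String]): Unit = {
    measurements.foreach { case (size, rate) =>
      println(f"$size%.0f kB at $rate%.1f kB/s -> ${elapsedSeconds(size, rate)}%.1f s")
    }
    val p = growthExponent(measurements.head, measurements.last)
    println(f"approximate growth exponent: $p%.2f")
  }
}
```

Under those assumptions, the exponent between the smallest and largest files comes out close to 2, which would suggest that run time grows roughly quadratically with file size rather than linearly. This is only an interpretation of the six data points above, not a profiling result.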
The tests have been added to daffodil-perf/src/test/.../csv/TestCSV.scala.
The data files are located in a different repository on Tresys's network:
svn+ssh://username@repos/repos/svn/ngf-dfdl/Input/csv
Is triggered by: DFDL-976 Performance: XPath expressions are slow (Closed)