It's time to get things started on an OpenSource project for the cause.
Please PM me (through DU, or email bri at abrij dot org) if you are interested in helping and have any of the skills below. The first item on the list is especially needed right now.
- Experience in setup and maintenance of projects on an OpenSource development server (SourceForge, Savannah, etc.)
- Perl at moderate to advanced skill levels.
- Knowledge or special interest in the internal workings of OpenSource OCR software like gocr.
- Knowledge or special interest in the internal workings of OpenSource format extraction tools like pdftotext, word2x, etc.
- Knowledge or special interest in the internal workings of web spidering packages like wget, curl, etc.
- Any other skills you think might help in a support role, e.g. distribution packaging, project promo and website, expert knowledge.
So far I've been working with USCountVotes and giving them some advice about their database design. To wit, I've helped them anticipate several of the ickier "gotchas" that a database must address to deal with the slipshod nature of election return data. I've started writing preliminary parsing code; the parsers for ASCII versions of two Diebold reporting formats are near completion. I've also started organizing a list of the various formats we need to deal with here:
http://abrij.org/~bri/my2c/boefmts.html
This software project will work very closely with USCountVotes and will focus primarily on helping them automate data acquisition on a large scale. The project will be run and administered independently, though: that keeps it out of their hair, and it has a few parallel secondary objectives, like making the software available to BOEs to encourage them to produce faster and more thorough returns, and for use by campaigns trying to judge the effectiveness of their GOTV drives.
USCountVotes does have an email list, and if you want to join up there you can, and should. This project will likely have a separate mailing list, since the USCountVotes one will be busy with posts about immediately needed data and overall database design, while this project will focus explicitly on data extraction and mining.