Breakthrough!
Jeeya S -
Hello fellow classmates! I have some exciting updates. As you know, I have been looking for better ways to filter, manage, and edit MARC records in bulk using OpenRefine, with some help from Excel. Now I am excited to say I have had a breakthrough! One of the key issues I hoped to address in the early stages of this project was the mass correction of common spelling errors in the 520 field of a MARC record. The 520 field holds a summary, abstract, annotation, or review of the material (book, audiobook, CD, etc.).
Here is a quick overview of the process: First, use MarcEdit to convert the .mrc file into a format OpenRefine can import (JSON). Then import the file into OpenRefine, create a project, and filter down to the 520 fields (along with any other edits you want to make). Next, export the data to an Excel file and use the spell check feature to fix spelling errors. (Important note: OpenRefine also points out spelling errors, but only if you click into each individual cell. Excel jumps through the cells in a column faster.) Once the Excel file has been edited, re-import it into OpenRefine, which creates a separate project, and merge the corrected column back into the original project. Finally, export the fully edited original project, bring it back into MarcEdit, and create a new .mrc file.
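For anyone curious what "filtering out the 520 fields" looks like outside of OpenRefine, here is a minimal Python sketch. It assumes the exported JSON follows the common MARC-in-JSON layout (a "fields" list of one-key objects); the file MarcEdit produces for you may be shaped differently. The function name extract_summaries and the sample record are made up for illustration.

```python
import json


def extract_summaries(record):
    """Return the text of every 520 $a subfield in one MARC-in-JSON record."""
    summaries = []
    for field in record.get("fields", []):
        # Assumed layout: each field is a one-key dict,
        # e.g. {"001": "value"} or {"520": {"subfields": [...]}}.
        for tag, content in field.items():
            if tag != "520" or not isinstance(content, dict):
                continue
            for subfield in content.get("subfields", []):
                if "a" in subfield:
                    summaries.append(subfield["a"])
    return summaries


# Made-up sample record with a deliberate typo in the summary.
sample = json.loads("""
{
  "leader": "00000nam a2200000 a 4500",
  "fields": [
    {"001": "rec0001"},
    {"245": {"ind1": "1", "ind2": "0", "subfields": [{"a": "Example title"}]}},
    {"520": {"ind1": " ", "ind2": " ", "subfields": [{"a": "A stroy about a lighthouse keeper."}]}}
  ]
}
""")

print(extract_summaries(sample))
```

The list it returns is the same text that would end up in the summary column sent to the spell checker.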
Some difficulties I experienced along the way:
GREL: OpenRefine is generally a user-friendly tool; however, some of the more complicated operations (such as merging columns from two separate projects) require GREL (General Refine Expression Language). GREL is used to perform complex data transformations, queries, and arrangements within OpenRefine. Luckily for me, GREL is designed to resemble JavaScript, which I am familiar with: expressions use variables and depend on data types to do things like string manipulation or mathematical calculations. Also, there were only two instances where I needed to use it, and neither was major (the UIUC LibGuides were very helpful for looking up specific commands).
Lack of access to Excel: Because I cannot sign into my personal Microsoft account on the library PC I am using, I used Google Sheets instead.
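To illustrate the idea behind the re-merge step mentioned under GREL above, here is a rough Python sketch. It is not the GREL expression itself, only the same lookup-by-key logic: match each row in the original set to its corrected row and copy the fixed summary over. The column names record_id and summary_520, the function name merge_corrections, and the sample rows are all assumptions for the example.

```python
def merge_corrections(original_rows, corrected_rows,
                      key="record_id", column="summary_520"):
    """Copy corrected values into the original rows, matched on a shared key."""
    # Build a lookup from key value to corrected text.
    lookup = {row[key]: row[column] for row in corrected_rows}
    merged = []
    for row in original_rows:
        updated = dict(row)  # copy so the input rows are not modified
        # Keep the original text when no corrected row matches.
        updated[column] = lookup.get(row[key], row[column])
        merged.append(updated)
    return merged


# Made-up rows standing in for the two OpenRefine projects.
original = [
    {"record_id": "rec0001", "title": "Example title",
     "summary_520": "A stroy about a lighthouse keeper."},
    {"record_id": "rec0002", "title": "Another title",
     "summary_520": "An audiobook on gardening."},
]
corrected = [
    {"record_id": "rec0001", "summary_520": "A story about a lighthouse keeper."},
]

merged = merge_corrections(original, corrected)
for row in merged:
    print(row["record_id"], "->", row["summary_520"])
```

Matching on a stable identifier rather than on row position matters here, because the spreadsheet round trip can reorder or drop rows.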
Tune in next time! Thank you. (I attached an image of a record set in OpenRefine, in case you are curious.)