Analysis

The analysis for this paper was completed in Excel. I had planned to clean some data in Excel and then continue the analysis in RStudio. However, because the tools within Excel were sufficient for understanding our data, the process stayed in Excel. Using multiple tools can yield more robust conclusions, but the goal is not to force an extra analysis when it is not the best way to meet our goals. Drawing on “Data Modeling” by Crawford, “Text Analysis and Visualization” by Jockers and Underwood, and “Lynching, Visualization and Visibility” by Mullen, we try to turn personal stories into a visual project that relates each individual story to a larger picture of Lewiston-Auburn during the mill era. The specific Excel file is not included, but all related findings appear in the following images.

Highest Level of Education Attained
Number of Siblings within each of the three groups of subjects
Under each of the three subdivisions of data, we express whether a person was the first to work in the mills.
Language spoken at home, depending on where your parents were born.

The above images tell a story of the Lewiston-Auburn area during the mill era. We chose to compare each of our three groups (U.S./Canada, U.S., and Canada) across the four categories seen above. We find that if both of your parents were born in the U.S., there is a higher probability that you finished high school, while for individuals with both parents born in Canada, there remains a high chance of graduating from college. If both of your parents were born in the U.S., you may have fewer siblings; we would expect about 4 children in each U.S. family. Further, we know that there were no mills in Canada, so we would expect that if both of your parents were born in Canada, you may be the first in your family to work in the mills. We see this effect in the above analysis. Finally, language spoken at home remains the most intuitive finding. If both parents were born in the U.S., there is a 93% chance English is spoken at home, whereas in households with one or both parents born in Canada, French is spoken more often. What remains interesting, however, is that French is spoken in 7% of households with U.S.-born parents, meaning some people with U.S.-born parents still returned home to a French-speaking household. The French language never died off in these homes.
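Percentages like the language figures above are shares within each parent group. The analysis itself was done in Excel, so the following Python is only a minimal sketch of the same calculation. The records are made up for illustration and are not the project's data, and the function name language_shares is hypothetical.

```python
from collections import Counter, defaultdict

# Made-up (parent group, language spoken at home) pairs for illustration only.
# These are NOT the project's data and will not reproduce its percentages.
records = [
    ("U.S.", "English"),
    ("U.S.", "English"),
    ("U.S.", "French"),
    ("Canada", "French"),
    ("Canada", "French"),
    ("U.S./Canada", "English"),
    ("U.S./Canada", "French"),
]


def language_shares(records):
    """Return, for each parent group, the fraction of subjects per home language."""
    counts = defaultdict(Counter)
    for group, language in records:
        counts[group][language] += 1

    shares = {}
    for group, langs in counts.items():
        total = sum(langs.values())
        # Divide each language count by the size of its own group,
        # so the shares within one group always sum to 1.
        shares[group] = {lang: n / total for lang, n in langs.items()}
    return shares


shares = language_shares(records)
for group, langs in shares.items():
    for lang, share in langs.items():
        print(f"{group}: {lang} {share:.0%}")
```

Dividing by the size of each group, rather than by the total number of subjects, is what lets groups of different sizes be compared side by side.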

In conclusion, it remains interesting to see the personal stories of so many people used to paint a general picture of Lewiston-Auburn. It would be awesome to see how these findings relate to other regions during this time period. Do these findings align with what we would find in other regions or even counties during the early 1900s? The people at the LA Museum were very interested in this question. Maybe we can explore this at a later date!

   

Ethical Concerns Addressed

Anytime you are given access to personal data, proper methods should be undertaken to limit the possibility of “doing-something-bad” with this information. In the age of data, firms such as Facebook are trusted with the personal data of over 2 billion people, and they should act as ethically as possible in handling it. Selling data or exposing private information should be avoided at all costs. In the process of this project, the personal data of fellow Mainers was collected. This included very personal information such as where a person lived, the language they spoke, and their birthdate. This information was shared willingly by each individual, but there remains an agreement that it would not be publicly shared or “thrown” on the internet for millions of people to see. Whether a person is deceased or still living in the Lewiston region should not affect the ethical concerns of using this data. Papers such as “What you can, can’t and shouldn’t do with social media data” by Tatman express some of these concerns. A tweet by John Abowd also expressed these concerns in relation to major tech companies, yet the same ethics should apply when handling our smaller data sets. Additionally, if we look at personal data as seen in “Becoming Null” by Date and Darwen, we continue to raise ethical questions relating to big tech. All data needs to be respected.

Context for Reader

Beginning in the mid-1800s and for nearly 100 years after, Lewiston-Auburn (LA), Maine thrived as a mill town. The Androscoggin River of the LA region has the steepest pitch of water flow in the Northeast. Mills manufacturing shoes, textiles, and bricks were packed alongside the river, employing thousands of workers. This part of LA history impacted the lives of so many mothers, fathers, and children, each with a unique story to tell. The data used in this project was collected from hundreds of interviews conducted with ex-mill workers, who told their stories of pain, their experiences, and how their personal story falls within the long scope of Maine history. Once these interviews were completed, the information was sorted into spreadsheets, which allowed us to study the connections between each unique story. For this project, each interviewee told us where their mother and father were born, which made all of the following analysis possible. Looking at the categories of siblings, language, first-in-family, and education allowed a general story to be told. It is as interesting to understand how the people of LA were educated as it is to learn the specifics of one person’s experience. Depicting a general picture of LA during this period in history expands our knowledge of the area and brings us closer to the city that hosts Bates College. Thomas Padilla, in the video “Collections as Data”, explains the importance of searching for these insights and why we all need to share this information with one another to create “a life worth living”.

How data was produced.

I used data from various CSV files, provided by our own Lewiston Museum, that included information about families. Once it was determined that the country of birth for both mother and father was available for a majority of our subjects, I broke this information down into three groups: parents born in separate countries (U.S. and Canada), both parents born in the U.S., and both parents born in Canada. The four areas in the analysis are: the number of siblings, language spoken at home, the highest level of education attained, and whether that person was the first to work in the mills. The data was cleaned to strip out unnecessary cells and create our three distinct subject groups. Additional variables were converted to numerical values so charts and other forms of study could be carried out. Papers for reference include “Text Analysis and Visualization” (Jockers) and “Sentiment Analysis and Subjectivity” (Liu). As the data set was rather small (<100 subjects), I was able to look over each cell, ensuring our group of subjects was cleaned properly and could yield conclusive results. We can think of our subdivision of subjects into three groups (Canada/U.S., Canada, U.S.) as analogous to a Topic Modeling method, grouping subjects into “buckets of words” rather than forcing them to act independently (“Digital Humanities Contribution to Topic Modeling”, Meeks and Weingart).
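The grouping step described above was carried out by hand in Excel. As a rough sketch only, the same rule could be written in Python as follows. The column names mother_born and father_born, the country labels, and the sample rows are all assumptions made for illustration, since the actual CSV layout is not shown here.

```python
from collections import Counter

# Hypothetical column names and made-up rows; the real CSV files are not shown.
rows = [
    {"mother_born": "Canada", "father_born": "Canada"},
    {"mother_born": "U.S.", "father_born": "Canada"},
    {"mother_born": "U.S.", "father_born": "U.S."},
    {"mother_born": "", "father_born": "U.S."},  # missing birthplace
]


def parent_group(mother_born, father_born):
    """Assign a subject to 'U.S.', 'Canada', or 'U.S./Canada'.

    Returns None when a birthplace is missing or is some other country,
    so that subject is left out of the three groups.
    """
    countries = {mother_born.strip(), father_born.strip()}
    if countries == {"U.S."}:
        return "U.S."
    if countries == {"Canada"}:
        return "Canada"
    if countries == {"U.S.", "Canada"}:
        return "U.S./Canada"
    return None


groups = [parent_group(r["mother_born"], r["father_born"]) for r in rows]

# Count how many subjects fall into each of the three groups.
group_counts = Counter(g for g in groups if g is not None)
print(group_counts)
```

Subjects without a usable birthplace for both parents are returned as None rather than guessed, which mirrors the decision to analyze only subjects whose parents' countries of birth could be accessed.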

Questions we want to answer.

Using data compiled from various sources, a semester’s worth of work from many students unlocked many insights into the lives of the mill workers from Lewiston-Auburn. For this project, I wanted to extend this previous work and determine the effect parents’ origin had on the lives of these workers. Would certain factors differ depending on where a person’s parents were born? The factors we wanted to look at are the following: the number of siblings, language spoken at home, the highest level of education attained, and whether that person was the first to work in the mills.
