Abstract
Here's a project where you can try your hand at being a detective with your computer. In this project you'll write a program to do some basic analysis of features of written text (for example, counting the length of each word in the text, or the number of words in each sentence). Then you'll see if you can use the information from your text analysis program to find measurements that can distinguish one author from another. After analyzing known samples of several authors' writings, can your method match up unidentified writing samples with their correct authors?
Objective
The goal of this project is to write a computer program to make some simple measurements on a block of text, and then to see if this information can be used to identify the author of the text.
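As a rough illustration of both halves of that goal, here is a minimal JavaScript sketch. It is only one possible approach, not the program from the Science Buddies project: the function names measureText and closestAuthor, the rules for what counts as a word or a sentence, and the choice of features are all assumptions made for this example, and the author profiles are made-up numbers.

```javascript
// Hypothetical helper (name and returned fields are illustrative only).
// Makes a few simple measurements on a block of text.
function measureText(text) {
  // Assumption: a sentence ends at one or more of . ! ?
  const sentences = text
    .split(/[.!?]+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  // Assumption: a word is a run of letters, with optional internal
  // apostrophes (so "don't" counts as one word).
  const words = text.match(/[A-Za-z]+(?:'[A-Za-z]+)*/g) || [];

  // Total number of characters across all words.
  const totalLength = words.reduce((sum, w) => sum + w.length, 0);

  return {
    wordCount: words.length,
    sentenceCount: sentences.length,
    avgWordLength: words.length ? totalLength / words.length : 0,
    avgWordsPerSentence: sentences.length
      ? words.length / sentences.length
      : 0,
  };
}

// Hypothetical comparison: return the known author whose measurements are
// closest to the unknown sample, using the sum of squared differences.
// Note that the features are not rescaled, so the one with the larger
// numeric range (words per sentence) dominates the result.
function closestAuthor(unknownStats, knownProfiles) {
  const features = ["avgWordLength", "avgWordsPerSentence"];
  let best = null;
  let bestDistance = Infinity;
  for (const [author, stats] of Object.entries(knownProfiles)) {
    const distance = features.reduce(
      (sum, f) => sum + (unknownStats[f] - stats[f]) ** 2,
      0
    );
    if (distance < bestDistance) {
      bestDistance = distance;
      best = author;
    }
  }
  return best;
}

// Usage with a tiny sample and made-up author profiles.
const sample = "The cat sat. It was a very good cat!";
const stats = measureText(sample);
console.log(stats.wordCount, stats.sentenceCount, stats.avgWordsPerSentence);

const profiles = {
  "Author A": { avgWordLength: 3.0, avgWordsPerSentence: 5 },
  "Author B": { avgWordLength: 5.0, avgWordsPerSentence: 30 },
};
console.log(closestAuthor(stats, profiles));
```

For the sample sentence pair, the sketch counts 9 words in 2 sentences, or 4.5 words per sentence, and picks the made-up "Author A" profile as the closer match. With real writing samples you would need far more text per author, and you would want to check whether any single measurement actually separates the authors before relying on it.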
Introduction
Your English teacher has probably told you that every author has an individual writing style—their own unique 'voice' on the page. Is it possible to find ways to identify that voice through computer analysis of written text?
A familiar case from history argues that it is indeed possible. When our forefathers, newly independent from Great Britain, were debating whether to do away with the Articles of Confederation and adopt the new Constitution written by a convention in Philadelphia, a series of essays was written to argue in favor of adopting the new government. These essays, now called The Federalist Papers, were signed "Publius," but are now attributed to Alexander Hamilton, James Madison, and John Jay. The authorship of 12 of the essays was claimed by both Hamilton and Madison. As Julie Rehmeyer writes in a recent Science News article (Rehmeyer, 2007): "Altogether, researchers have considered more than 1,000 features of writing style. Nearly all the analyses have vindicated Madison."
Relax, you won't need to analyze 1,000 different features for your science fair project. The Science Buddies project, Paragraph Stats: Writing a JavaScript Program to 'Measure' Text, shows you how to write a simple program to measure:
Terms, Concepts and Questions to Start Background Research
To do this project, you should do research that enables you to understand the following terms and concepts:
Questions
Bibliography
Materials and Equipment
To do this experiment you will need the following materials and equipment:
Experimental Procedure
Variations
Credits
Andrew Olson, Ph.D., Science Buddies
Last edit date: 2007-03-23 12:00:00