AntCorGen Manual

AntCorGen is a freeware corpus generation tool developed by Laurence Anthony that allows users to search and download documents from the PLOS ONE research database, analyze parts of speech, and cluster similar sentences. It is compatible with Windows, MacOS, and Linux, and requires no installation for use. Users can create and manage corpus collections and clustering processes through a series of guided steps within the application.

AntCorGen (Windows, MacOS, Linux)

Build 1.3.0
Laurence Anthony, Ph.D.
Center for English Language Education in Science and Engineering, School of Science and
Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
February 2, 2024

Introduction
AntCorGen is a freeware corpus generation tool. AntCorGen lets you search for documents in the PLOS ONE
research database via search queries and/or subject category browsing and decide which sections (e.g. title,
abstract, introduction) of these documents should be stored. AntCorGen then accesses the database, downloads
the sections, and saves each one as a text file in an appropriate folder. AntCorGen can also analyze the different
parts of speech (e.g. adjectives, verbs) of words in the files and cluster similar sentences into sub-groups. These
sub-groups will show similar patterns of language use.

AntCorGen runs on computers running Microsoft Windows (tested on Win 7), Macintosh OS X (tested on OS
X 10.9 Mavericks), and Linux (tested on Linux Mint 17). It is developed in Python and Qt, and PyInstaller is
used to generate executables for the different operating systems.

Getting Started (No installation necessary)

Windows - Installer
Double click the AntCorGen.exe file and follow the instructions to install the application into your Programs folder.
You can delete the .exe file when you are finished. You can start the application via the Start Menu.

Windows - Portable
Unzip the AntCorGen.zip file into a folder of your choice. In the AntCorGen folder, double click the AntCorGen.exe
file to launch the program.

Macintosh OS X
Double click the AntCorGen.dmg file to create an AntCorGen disk image on your desktop. Open the disk image
and drag and drop the AntCorGen app onto the Applications folder (or into another location if you desire). You
can then launch the app by double clicking on the icon in the Applications folder or the Launchpad.

Linux
Decompress the AntCorGen.tar.gz file into a folder of your choice. In the AntCorGen folder, double click the
AntCorGen.sh file to launch the software. Alternatively, on the command line, type ./AntCorGen.sh to launch
the software.

Text Collection - Quick Guide
Step 1: Select a corpus storage folder into which the corpus files will be
saved using the "Choose" button.

Step 2: Choose documents to be included in the corpus collection.

Option A: Search for relevant documents using the "Query Viewer/Editor"
and/or "Query" settings:
1) The "Query Viewer/Editor" will show a complete query in the
Solr search query language used by PLOS ONE. More
information about the query language can be found at the following links:
• [Tutorial] http://www.solrtutorial.com/solr-query-syntax.html
• [Examples] http://api.plos.org/solr/examples/
• [Main Solr site] http://lucene.apache.org/solr/
2) The "Query" tool allows you to build a query using field names, query items, and AND/OR/NOT
operations. To add or delete parts of the query, use the +/- buttons. All changes made in the "Query"
tool will be reflected in the "Query Viewer/Editor".
Option B: Browse for relevant documents using the "PLOS ONE Categories" browser tree:
1) Click on category branches in the browser tree to expand the branches and show sub-categories. The
number of documents in each category is shown in parentheses.
2) Select categories to be included in the collection. The total number of documents within the selected
categories is shown in the browser tree header.
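
The way the "Query" tool combines field names, query items, and AND/OR/NOT operations into one query string can be sketched in Python as follows. This is only an illustrative sketch, not AntCorGen's own code: the function names, the field names (title, abstract, subject), and the search URL are assumptions made for the example, and the final query shown in the "Query Viewer/Editor" may be formatted differently.

```python
from urllib.parse import urlencode


def build_query(terms):
    """Join (operator, field, value) triples into one Solr-style query string.

    The operator of the first triple is ignored; later operators must be
    AND, OR, or NOT.
    """
    parts = []
    for i, (op, field, value) in enumerate(terms):
        # Quote the value so that multi-word phrases stay together.
        clause = f'{field}:"{value}"'
        if i == 0:
            parts.append(clause)
        else:
            if op not in {"AND", "OR", "NOT"}:
                raise ValueError(f"unsupported operator: {op}")
            parts.append(f"{op} {clause}")
    return " ".join(parts)


def build_search_url(query, max_hits=None, base="http://api.plos.org/search"):
    """Encode the query as URL parameters (the base URL is an assumption)."""
    params = {"q": query, "wt": "json"}
    if max_hits is not None:
        params["rows"] = max_hits  # mirrors the "Max hits" option
    return f"{base}?{urlencode(params)}"


# Hypothetical field names and query items, for illustration only.
query = build_query([
    (None, "title", "corpus"),
    ("AND", "abstract", "language learning"),
    ("NOT", "subject", "genetics"),
])
url = build_search_url(query, max_hits=50)
print(query)
print(url)
```

The first line printed is title:"corpus" AND abstract:"language learning" NOT subject:"genetics". The sketch does not escape quotation marks inside a query item, so items containing a double quote would need extra handling.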

Step 3: Decide whether or not to set a maximum number of corpus files to collect using the "Max hits" checkbox
and spinbox.

Step 4: Click the "Find Hits" button to show an estimate of the total number of documents ("Total Hits") that will
be collected. The result is shown in the status window.

Step 5: Click the "Create Corpus" button to collect the documents and store them in the corpus storage folder.
The total number of documents (hits) will be updated to show how many have been collected. The ID and
collection status of each document are shown in the status window.

Step 6: Click the "Stop" button to stop the collection process at any time.
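
The collection steps above can be pictured with a short Python sketch: each section of each document is saved as a text file in a folder, collection ends once a maximum number of hits is reached, and a stop check is consulted before every document. This is a hedged illustration only. The folder layout (one folder per section), the function names, the should_stop callback, and the placeholder documents are all assumptions; the manual states only that each section is saved as a text file in an appropriate folder.

```python
import tempfile
from pathlib import Path


def save_sections(doc_id, sections, corpus_dir):
    """Write each section of one document to <corpus_dir>/<section>/<doc_id>.txt."""
    safe_id = doc_id.replace("/", "_")  # IDs may contain slashes; keep names flat
    for name, text in sections.items():
        folder = Path(corpus_dir) / name
        folder.mkdir(parents=True, exist_ok=True)
        (folder / f"{safe_id}.txt").write_text(text, encoding="utf-8")


def collect(documents, corpus_dir, max_hits=None, should_stop=lambda: False):
    """Save documents until max_hits is reached or should_stop() returns True."""
    collected = 0
    for doc_id, sections in documents:
        if should_stop() or (max_hits is not None and collected >= max_hits):
            break
        save_sections(doc_id, sections, corpus_dir)
        collected += 1
    return collected


# Placeholder documents; a real run would receive these from the database.
docs = [
    ("example/doc-1", {"title": "First title", "abstract": "First abstract."}),
    ("example/doc-2", {"title": "Second title", "abstract": "Second abstract."}),
    ("example/doc-3", {"title": "Third title", "abstract": "Third abstract."}),
]

with tempfile.TemporaryDirectory() as tmp:
    n_collected = collect(docs, tmp, max_hits=2)
    files = sorted(p.relative_to(tmp).as_posix() for p in Path(tmp).rglob("*.txt"))

print(n_collected)
print(files)
```

With max_hits=2, only the first two placeholder documents are saved, giving two files in the "abstract" folder and two in the "title" folder.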

Text Clustering - Quick Guide

Step 1: Select a "source folder" of corpus files that you want to cluster using
the "Choose" button.

Step 2: Select a "target folder" into which the clustered files will be saved
using the "Choose" button.

Step 3: Choose features that you want to include as part of the clustering
algorithm.

Step 4: Choose parameters (max number of features, min frequency of features, number of clusters) that you
want to use in the clustering algorithm. To use all the possible features, set the "Max features" option to -1 (the
default). If you are not sure how many clusters to pick, use the "Auto Detect" option.

Step 5: Click the "Preview" button to show a scatter-plot visualization of the clusters. If the clusters are not
separated in the visualization, adjust the features and parameters as necessary. The scatter-plot can be resized,
zoomed, label-adjusted, and saved using the icons above the plot image.

Step 6: Click the "Create Clusters" button to cluster the document sentences and store them in the target folder.

Step 7: Click the "Stop" button to stop the clustering process at any time.
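
To make the clustering parameters more concrete, the following Python sketch shows how "Max features" (with -1 meaning all features), a minimum feature frequency, and a number of clusters could interact. It is not AntCorGen's algorithm, which this manual does not describe. As a stand-in, the sketch uses plain word counts as features and a basic k-means with a deterministic farthest-point start; part-of-speech features, "Auto Detect", and the scatter-plot preview are not covered, and all names are hypothetical.

```python
import math
from collections import Counter


def vectorize(sentences, max_features=-1, min_freq=1):
    """Turn sentences into word-count vectors over a filtered vocabulary."""
    tokenized = [s.lower().split() for s in sentences]
    totals = Counter(w for toks in tokenized for w in toks)
    # Keep words that meet the minimum frequency, most frequent first.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    vocab = [w for w, count in ranked if count >= min_freq]
    if max_features != -1:  # -1 means "use every feature"
        vocab = vocab[:max_features]
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[w] for w in vocab])
    return vocab, vectors


def init_centroids(vectors, k):
    """Start from the first vector, then repeatedly add the farthest one."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(math.dist(v, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids


def kmeans(vectors, k, iterations=10):
    """Basic k-means: assign each vector to its nearest centroid, then update."""
    centroids = init_centroids(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: math.dist(v, centroids[j])) for v in vectors]
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # leave an empty cluster's centroid unchanged
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


sentences = [
    "the cells were grown in culture",
    "the cells were stained in culture",
    "we propose a new model",
    "we propose a simple model",
]
vocab, vectors = vectorize(sentences, max_features=-1, min_freq=2)
labels = kmeans(vectors, k=2)
print(vocab)
print(labels)
```

With a minimum frequency of 2, words that occur only once ("grown", "stained", "new", "simple") are dropped, and the two sentence patterns fall into separate clusters, labelled [0, 0, 1, 1].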

NOTES
Comments/Suggestions/Bug Fixes
All new editions and bug fixes are listed in the revision history below. However, if you find a bug in the program,
or have any suggestions for improving the program, please let me know and I will try to address the issues in a
future version.

This software is available as 'freeware' (see Legal Matter below), but it is important for my funding to hear about
any successes that people have with the software. Therefore, if you find the software useful, please send me an
e-mail briefly describing how it is being used.

CITING/REFERENCING AntCorGen
Use the following method to cite/reference AntCorGen according to the APA style guide:

Anthony, L. (YEAR OF RELEASE). AntCorGen (Version VERSION NUMBER) [Computer Software]. Tokyo, Japan:
Waseda University. Available from http://www.antlab.sci.waseda.ac.jp/

For example, if you download AntCorGen 1.0.0, which was released in 2017, you would cite/reference it as follows:
Anthony, L. (2017). AntCorGen (Version 1.0.0) [Computer Software]. Tokyo, Japan: Waseda University. Available
from http://www.antlab.sci.waseda.ac.jp/
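
If you cite several releases, the template above can be filled in programmatically. The following is a minimal Python sketch; the function name is hypothetical, and only the release year and the version number vary.

```python
def cite_antcorgen(year, version):
    """Fill the citation template with a release year and a version number."""
    return (
        f"Anthony, L. ({year}). AntCorGen (Version {version}) [Computer Software]. "
        "Tokyo, Japan: Waseda University. "
        "Available from http://www.antlab.sci.waseda.ac.jp/"
    )


citation = cite_antcorgen(2017, "1.0.0")
print(citation)
```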

KNOWN ISSUES
None at present.


Copyright: Laurence Anthony 2024