Environment setup

From LING073
Revision as of 16:19, 6 January 2018 by Jwashin1 (talk | contribs) (Using git for version control)

Getting accounts for the lab machines

You'll want access to the lab machines for working on assignments and projects. You can also use your own system if you want, but getting the tools set up can be difficult, and it's better if everyone begins with equal access to the tools.

Day 1 activity:

  • If you do not yet have an account to access the lab machines, Jeff will come around and make you an account. Log in and reset your password according to the instructions.
  • Once you have an account, check whether those around you need help getting one.

Using GNU/Linux

Using a desktop installation of GNU/Linux these days is much like using other desktop operating systems, like Windows or macOS, with the added importance of the command line.

The exercises below will familiarise you with a few aspects of using a GNU/Linux environment that we'll be encountering a lot in this course.

Editing a wiki

Wikis offer a great way to collaboratively publish information online. Using this course wiki, you will be publishing publicly visible materials related to this class throughout the semester. All submissions are logged, so vandalism is easily identified and reverted, and the timestamp of any submitted work is also available.

Day 1 activity:

  • Log in to the class wiki using your Swarthmore login credentials (link above), and create a user page. Your user page will have a URL ending in User:student1, where "student1" is your username. It's also linked to at the top of the page after you log in (as your username in red).
    • tl;dr: log in, click your username at the top, and click the "Create" tab.
  • Write a short introduction including any information you care to share (e.g., name, major, etc.) and a short bulleted list of a couple of languages you're considering for your work in this class (or any languages you're interested in generally, if you haven't thought about this yet). You can use MediaWiki markup to format what you write.
  • Add the page to the category sp18_students by adding [[Category:sp18_students]] at the very bottom of the page.
  • Enter an edit summary and click "Save page" when you're done.

Using IRC

Internet Relay Chat is a tried-and-tested technology that allows for convenient real-time online messaging for large groups of people, and also between individuals.

The official IRC channel for this class (and unofficial Swarthmore Computational Linguistics IRC channel) is #cl on sccs.swarthmore.edu, kindly hosted by SCCS. (The server must be accessed through the campus network or VPN. It should also be accessible from BiCo networks.)

There are good, Free IRC clients for nearly every operating system. Many GNU/Linux installations come with HexChat, which presents a good balance of features and ease of use.

We will be using IRC as a way to interact outside of the classroom, and sometimes in it. It can be a good way to ask me for help or clarification on an assignment, in which case your classmates could benefit too. You can also use it to have general discussion among yourselves, or work remotely in groups. Keep in mind that any user can start a new channel, and you can always tell which users are in a channel.

There's also an IRC channel for Apertium. We will be using Apertium and related software throughout the course, so this can be a good place to ask for assistance and tech support.

Day 1 activity:

  • Start up HexChat on your machine.
  • Add the network (see above), set your username (e.g., to match your Swarthmore login name), and connect to the network.
  • Join the class channel (see above), and greet anyone already in the channel. Try to stay to greet others who join.

Using a command line

The command line makes a Linux system particularly powerful, and is useful for activities such as those we'll be doing in this class. There is a great deal of information online about how to use the command line.

Day 1 activity #1:

  1. Open a terminal, and create a directory for this class. The easiest way is to type something like mkdir ling073. Then change into that directory, e.g. cd ling073.
  2. Referencing a guide to using the Linux command line like this one, use cat to create a file in that new directory, use nano to change the contents, and use cat to display the new contents of the file.
  3. When you're done, see if anyone else needs help doing this.
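
If you'd like to see the shape of these steps before trying them yourself, here is a minimal sketch. It makes a few assumptions that are not part of the activity: it works inside a throwaway directory from mktemp, the file name hello.txt is just an example, and a here-document stands in for typing text at the keyboard. The nano step is interactive, so it only appears as a comment.

```shell
# Work in a throwaway directory so nothing in your home directory is touched.
# (Assumption for this sketch; in the activity you work wherever you like.)
workdir=$(mktemp -d)
cd "$workdir"

# Step 1: make a directory for the class and move into it.
mkdir ling073
cd ling073

# Step 2: create a file with cat. Interactively you would run
# `cat > hello.txt`, type some text, and press Ctrl-D to finish.
# The here-document below feeds cat the same text without the keyboard.
cat > hello.txt <<'EOF'
Hello from the command line
EOF

# Display the contents of the file.
cat hello.txt

# To change the contents interactively, you would now run: nano hello.txt
# and then run `cat hello.txt` again to see the result.
```

When you do this by hand, drop the mktemp lines and type the text yourself instead of using the here-document.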

Day 1 activity #2:

  1. Follow the "Get the data" instructions here to download the code and linguistic data for an existing Italian-to-Sardinian machine translation system.
  2. Use a guide to using the Linux command line like this one to get the last line of the Italian morphological analyser, specifically the file named apertium-ita.ita.dix. Once you've found it, do not announce what you've found; instead, help your neighbour find it, or if they already have, see if it's the same.

Day 1 activity #3:

  1. Finish the compiling instructions for the Italian-Sardinian translation system.
  2. Test the system as described. Try a different Italian input sentence: "Un corpus è un insieme di testi che sono stati selezionati in modo da essere funzionali per l'analisi linguistica." (source). What's the output? What do you think the * symbol might mean?

Options for text editors

In this class you will be writing a lot of code (though no programming is required!), so you will want to get comfortable with a text editor. This article outlines the main options available to you in a vanilla GNU/Linux environment. If you aren't familiar with any text editors, I recommend gedit for everyday use and nano for use within a terminal.

Using git for version control

Git is software that tracks collaborative work on projects. It stores work in "repositories" and gives you access to previous "revisions". You "clone" an existing repository. You can "commit" changes to your local copy, and "push" the commits back to a remote server. You should "pull" from the server if you think someone else may have also pushed commits. There is info available about using git at Swarthmore, which includes these commands.
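
To get a feel for the clone, commit, push, and pull cycle without touching a real server, you can try something like the sketch below. Everything in it is an assumption made for illustration: a bare repository in a temporary directory plays the role of the remote server, and the two working copies, their names, and their email addresses are hypothetical stand-ins for you and a classmate.

```shell
# A throwaway sandbox; a bare repository plays the role of the remote server.
sandbox=$(mktemp -d)
git init --quiet --bare "$sandbox/server.git"

# student1 clones the (still empty) repository, commits a file, and pushes it.
# The redirect hides git's warning about cloning an empty repository.
git clone --quiet "$sandbox/server.git" "$sandbox/student1" 2>/dev/null
cd "$sandbox/student1"
git config user.name "Student One"             # hypothetical identity
git config user.email "student1@example.org"   # hypothetical address
echo "hello from student1" > student1
git add student1
git commit --quiet -m "adding my new file"
git push --quiet -u origin HEAD   # first push; -u also records the upstream branch

# student2 clones the repository, adds their own file, and pushes it.
git clone --quiet "$sandbox/server.git" "$sandbox/student2"
cd "$sandbox/student2"
git config user.name "Student Two"             # hypothetical identity
git config user.email "student2@example.org"   # hypothetical address
echo "hello from student2" > student2
git add student2
git commit --quiet -m "adding my new file"
git push --quiet

# student1 pulls and can now read the message student2 wrote.
cd "$sandbox/student1"
git pull --quiet
cat student2
```

With a real server, the only difference is the address you clone from; the add, commit, push, and pull steps are the same.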

Day 1 activity:

  1. Follow these instructions to set up an SSH key.
  2. Open a terminal and go into your Source directory (cd ~/Source)
  3. Clone the day1 repository (git clone git@github.swarthmore.edu:Ling073-sp18/day1.git)
  4. Add a file named after your username in the new directory, and put a short message in it. You can use a text editor for this, or just do it on the command line (cd day1; cat > student1; type your message; press Enter, then Ctrl-D to end the input)
  5. Add the file (git add student1), then commit it (git commit student1 -m "adding my new file")
  6. Push your changes (git push) and pull the changes of your classmates (git pull). Once your classmates have pushed and you've pulled, you can read their messages (cat student2).