Environment setup

From LING073

Logging in on the lab machines

You'll want access to the Swarthmore CS lab machines for working on assignments and projects. You can use your own system instead, but getting the tools set up can be difficult, and it's better if everyone begins with equal access to the tools, so I recommend against it for now. Instead, set up your account on the CS lab machines, and see the activities below for connecting to and interacting with them.

Day 1 activity:

  • If you haven't previously used the CS lab machines, you should have received an email for setting up login information. Set up an account according to the instructions.
  • If you don't have access and didn't get an email, let the instructor or course assistant know right away.
  • Once/if you do have an account, check to see if those around you need any help logging in.

Acclimating to the tools we'll be using in class

The exercises below will familiarise you with a few aspects of the tools we'll be using in the course, including using a command line.

Editing a wiki

Wikis offer a great way to collaboratively publish information online. Using this course wiki, you will be publishing publicly-visible materials related to this class throughout the semester. All submissions are logged, so vandalism is easily identified and reverted, and the timestamp of any submitted work is also available.

Day 1 activity:

  • Log in to the class wiki using your Swarthmore login credentials (link above), and create a user page. Your user page will have a url ending in User:student1, where "student1" is your username. It's also linked to at the top of the page after you log in (as your username in red).
    • tl;dr: log in, click your username at the top, and click the "Create" tab.
  • Write a short introduction including any information you care to share (e.g., name, major, etc.) and a short bulleted list of a couple languages you're considering for your work in this class (or any languages you're interested in generally, if you haven't thought about this yet). You should use mediawiki markup to format what you write.
    • Warning: anything you put on the wiki is publicly visible. You may not want to include personal information like your name. If you prefer for your work in this course to not even be linked to your Swarthmore username, let me know and I will attempt to make an accommodation.
  • Add the page to the category sp21_students by adding [[Category:sp21_students]] at the very bottom of the page.
  • Enter an edit summary and click "Save page" when you're done.
  • All pages added to the category above will be listed on the category page—double check that yours is!

Using IRC

Internet Relay Chat is a tried-and-tested technology that allows for convenient real-time online messaging for large groups of people, and also between individuals.

The official IRC channel for this class (and unofficial Swarthmore Computational Linguistics IRC channel) is #swatling on irc.freenode.net.

There are good, Free IRC clients for nearly every operating system, so you shouldn't have to pay anything. Many GNU/Linux installations come with HexChat, which presents a good balance of features and ease of use; it's available for Windows too. LimeChat is a good option for macOS. There are also good clients for mobile OSes.

If you prefer a more "modern" feel, you may also connect to IRC using Matrix and any of its clients (e.g., riot.im/element.io). See the instructions here.

We will be using IRC as a way to interact outside of the classroom, and sometimes in. It can be a good way to ask me for help or clarification on an assignment, in which case your classmates could benefit too. You can also use it to have general discussion among yourselves, or work remotely in groups. Keep in mind that any arbitrary channel can be started by any user, and you can always tell which users are in a channel. Unofficial channels on Freenode should begin with ##, while official ones may begin with #.

There's also an IRC channel for Apertium. We will be using Apertium and related software throughout the course, so this can be a good place to ask for assistance and tech support.

Day 1 activity:

  1. Start up an IRC client
  2. Select freenode as the network, set your username (e.g., to match your Swarthmore login name), and connect.
  3. Join the class channel (see above), and greet anyone already in the channel. Stay until the end of class to greet others who join.
  4. Did you notice the channel "topic" / welcome message?

Using a command line via ssh

The command line makes a Linux system particularly powerful, and is useful for activities such as those we'll be doing in this class. There is an endless amount of information online about how to use the command line.

The easiest way to access all the tools we'll be using for class is to remotely connect to the CS lab machines using ssh. On Windows, you can run ssh from PowerShell (PuTTY, which can be downloaded, is another option). On macOS, simply start the Terminal app. On GNU/Linux, you can open a terminal of your choice.

Day 1 activity #1

  1. Connect to the lab computers over ssh. Once in a terminal with ssh access (see above), you can do something like the following (replacing "student1" with your username):
 ssh student1@lab.cs.swarthmore.edu
  2. You'll be prompted for your password. You won't be able to see anything when you type it; that's normal.
  3. The prompt should tell you which computer you connected to. Is it a cooking ingredient, a Pokémon character, or something else?

Day 1 activity #2

  1. Once you're connected, create a directory for this class. The easiest would be by typing something like mkdir ling073. Then change into that directory, e.g. cd ling073.
  2. Referencing a guide to using the Linux command line like this one, use cat to create a file in that new directory, use nano to change the contents, and use cat to display the new contents of the file.
  3. When you're done, see if anyone else needs help doing this.
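
The steps in this activity can be sketched roughly as follows. This is only an illustration, not the one right way to do it: the file name notes.txt is a made-up placeholder, the sketch runs in a throwaway temporary directory rather than your home directory, and because nano is interactive, a plain overwrite stands in for the hand edit you would make in the editor.

```shell
# Work in a throwaway directory so nothing in your home directory is touched
# (on the lab machines you would simply start from your home directory).
cd "$(mktemp -d)"

mkdir ling073          # create the class directory
cd ling073             # change into it

# Interactively you would type `cat > notes.txt`, enter some text, and
# press Ctrl+D. A here-document does the same thing non-interactively.
cat > notes.txt <<'EOF'
first draft
EOF

# In the activity you would now run `nano notes.txt` and edit by hand.
# This stand-in overwrites the file so the sketch can run unattended.
printf 'second draft\n' > notes.txt

cat notes.txt          # display the new contents
```

If you try this by hand, replace the printf line with an actual nano session; the final cat should then show whatever you typed in the editor.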

Day 1 activity #3:

  1. Follow the "Get the data" instructions here (but not the compiling instructions, yet) to download the code and linguistic data for an existing Italian-to-Sardinian machine translation system.
  2. Use a guide to using the Linux command line like this one to get the last line of the Italian morphological analyser, specifically the file named apertium-ita.ita.dix. Once you've found it, do not announce what you've found. Instead, help your neighbour find it, or if they already have, see if it's the same.

Day 1 activity #4:

  1. Finish the compiling instructions for the Italian-Sardinian translation system.
  2. Test the system as described. Try a different Italian input sentence: "Un corpus è un insieme di testi che sono stati selezionati in modo da essere funzionali per l'analisi linguistica." (source). What's the output? What do you think the * symbol might mean?

Using a text editor

In this class you will be writing a lot of code (though no programming is required!) so you will want to get comfortable with a text editor.

The easiest option for everyone will be to use Atom, with a series of plugins. If there's another editor you're comfortable with, feel free to use that.

Day 1 activity:

  1. Install Atom.
  2. Now install the following packages (depending on your version, either File > Settings > Install or Edit > Preferences > Install): teletype, ftp-remote-edit, and language-apertium.
    • If you're on Windows and get an error about needing the latest version of Visual Studio, try running npm install --global --production windows-build-tools on an admin command line and try again. Alternatively, try installing "VS 2019 C++ x64/x86 build tools" and "Desktop development with C++" in the Visual Studio installer. See this solution for more information. This plugin is just for syntax highlighting, so it's not the end of the world if you can't get it working.
  3. Set up ftp-remote-edit to connect to the CS lab servers. If you skip this step, the following steps won't make any sense.

Day 1 activity:

  1. Teletype allows you to collaboratively edit files. Try it out with someone nearby:
    1. Open one or more files.
    2. Click the antenna icon at the bottom of the screen.
    3. Click "share" or "join". If you're sharing, send the session id to your partner.
    4. See how it works!

If you prefer to use an editor other than Atom on your own machine, you can use sshfs (instructions here for macOS and Windows) to make a directory on the CS lab machine act like a folder on your computer. To use sshfs in GNU/Linux you can follow a tutorial like this one.

Using git for version control

Git is software that tracks collaborative work on projects. It stores work in "repositories" and gives you access to previous "revisions". You "clone" an existing repository. You can "commit" changes to your local copy, and "push" the commits back to a remote server. You should "pull" from the server if you think someone else may have also pushed commits. There is info available about using git at Swarthmore, which includes these commands.

Day 1 activity:

  1. Use ssh to connect to the CS lab machines.
  2. Follow these instructions to set up an SSH key
  3. On the command line, go into your ling073 directory (cd ~/ling073)
  4. If you completed the wiki activity correctly and I'm keeping up with everyone, I'll have given you access to the day1 repository on Swarthmore github. Assuming that's done, proceed.
  5. Clone the day1 repository (git clone git@github.swarthmore.edu:Ling073-sp21/day1.git)
  6. Add a file named after your username in the new directory, containing a short message. You can use a text editor for this (good practice using Atom!), or just do it on the command line (cd day1; cat > student1; type your message; press Enter followed by Ctrl+D to end the input)
  7. Add the file to the repository (on the command line: git add student1), then commit it with some comment about what you're doing (git commit student1 -m "adding my new file")
  8. Push your changes (git push) and pull the changes of your classmates (git pull). Once your classmates have pushed and you've pulled, you can read their messages (cat student2).
    • If it gives you an error message when you try to push, you may need to run git pull first, to pull and merge in your classmates' changes first.
    • When you pull, you may encounter a screen asking you to type a merge message. You can keep the default message and just save and exit. If it's in nano, use the shortcuts at the bottom of the screen to save and exit (probably Ctrl+O followed by Ctrl+X). If it's in vim, type :wq and press Enter.
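
If you'd like to rehearse the clone, add, commit, push, and pull cycle from the steps above without touching the real repository, something like the following sketch may help. It assumes git is installed. A local bare repository in a temporary directory stands in for git@github.swarthmore.edu:Ling073-sp21/day1.git, and the names student1 and student2 and the example.com addresses are placeholders, so no network access or SSH key is needed.

```shell
set -e

# Throwaway sandbox: a local bare repository stands in for the remote
# day1 repository (an assumption made so the sketch runs offline).
work="$(mktemp -d)"
git init --quiet --bare "$work/day1.git"

# student1 clones, adds a file named after their username, and pushes.
# (Cloning an empty repository prints a warning, silenced here.)
git clone --quiet "$work/day1.git" "$work/student1" 2>/dev/null
cd "$work/student1"
git config user.name "student1"                # placeholder identity
git config user.email "student1@example.com"
echo "hello from student1" > student1
git add student1
git commit --quiet -m "adding my new file"
git push --quiet -u origin HEAD                # first push into the empty remote

# A classmate (student2) does the same from their own clone.
git clone --quiet "$work/day1.git" "$work/student2"
cd "$work/student2"
git config user.name "student2"
git config user.email "student2@example.com"
echo "hello from student2" > student2
git add student2
git commit --quiet -m "adding my new file"
git push --quiet

# Back in student1's clone: pull the classmate's commit and read it.
cd "$work/student1"
git pull --quiet
cat student2
```

The final cat prints the message student2 committed. In the real activity you would skip the git init line and clone the day1 URL from step 5 instead; the remaining commands match what the steps ask you to type.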

Up next

If you've successfully completed all of these activities, you can get started on Language selection, your assignment for the next class meeting.