Difference between revisions of "Environment setup"

From LING073
(Using git for version control)
(Using git for version control)
 
(19 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
== Logging in on the lab machines ==
 
== Logging in on the lab machines ==
  
You'll want access to the Swarthmore CS lab machines for working on assignments and projects. You can also [[Using the tools on your own system|use your own system]] (optional!) if you want, but getting the tools set up can be difficult, and it's better if everyone begins with equal access to the tools, so I '''recommend against it for now'''.  Instead, set up your account on the CS lab machines, and see activities below for connecting to and interacting with them.
+
You'll want access to the Swarthmore CS lab machines for working on assignments and projects.
 +
 
 +
You can also [[Using the tools on your own system|use your own system]] (optional!) if you want, but getting the tools set up can be difficult, and it's better if everyone begins with equal access to the tools, so I '''recommend against it for now'''.  Instead, set up your account on the CS lab machines, and see activities below for connecting to and interacting with them.
  
{{notice|Do this activity only when you are able to visit the lab in person!}}
 
 
'''Day 1 activity #1: get comfortable with lab machines''':
 
'''Day 1 activity #1: get comfortable with lab machines''':
 
* If you haven't previously used the CS lab machines, you should have received an email for setting up login information.  Set up an account according the the instructions.
 
* If you haven't previously used the CS lab machines, you should have received an email for setting up login information.  Set up an account according the the instructions.
Line 19: Line 20:
 
* Log in to the class wiki using your Swarthmore login credentials (link at very top of any page on this wiki), and create a user page.  Your user page will have a URL ending in <code>User:student1</code>, where "student1" is your username.  It's also linked to at the top of the page after you log in (as your username in red).
 
* Log in to the class wiki using your Swarthmore login credentials (link at very top of any page on this wiki), and create a user page.  Your user page will have a URL ending in <code>User:student1</code>, where "student1" is your username.  It's also linked to at the top of the page after you log in (as your username in red).
 
** '''TL;DR''': log in, click your username at the top, and click the "Create" tab.
 
** '''TL;DR''': log in, click your username at the top, and click the "Create" tab.
* '''Write a short introduction''' including any information you care to share (e.g., name, major, etc.) and a short bulleted list of a couple languages you're considering for your work in this class (or any languages you're interested in generally, if you haven't thought about this yet).  You should use [https://www.mediawiki.org/wiki/Help:Formatting mediawiki markup] to format what you write.
+
* '''Write a short introduction''' including any information you care to share (e.g., name, major, etc.) and a short bulleted list of a couple languages you're considering for your work in this class (or any languages you're interested in generally, if you haven't thought about this yet).  If there's someone in the course you'd like to work with this semester, you can mention that here too.
** '''Warning: anything you put on the wiki is publicly visible.'''  You may not want to include personal information like your name.  If you prefer for your work in this course to not even be linked to your Swarthmore username, let me know and I will attempt to make an accommodation.
+
** You should use [https://www.mediawiki.org/wiki/Help:Formatting mediawiki markup] to format what you write.
* Add the page to the category <code>[[:Category:sp22_students|sp22_students]]</code> by adding <code><NOWIKI>[[Category:sp22_students]]</NOWIKI></code> at the very bottom of the page.
+
** {{Notice-inline|'''Warning: anything you put on the wiki is publicly visible.'''}} You may not want to include personal information like your name.  If you prefer for your work in this course to not even be linked to your Swarthmore username, let me know and I will attempt to make an accommodation.
 +
* Add the page to the category <code>[[:Category:sp23_students|sp23_students]]</code> by adding <code><NOWIKI>[[Category:sp23_students]]</NOWIKI></code> at the very bottom of the page.
 
* Enter an edit summary and click "Save page" when you're done.  An edit summary is a way of keeping track of what you did, so you could say something like "wrote a short introduction".
 
* Enter an edit summary and click "Save page" when you're done.  An edit summary is a way of keeping track of what you did, so you could say something like "wrote a short introduction".
 
* All pages added to the category above will be listed on the category page—double check that yours is, and feel free to peruse what your classmates wrote.
 
* All pages added to the category above will be listed on the category page—double check that yours is, and feel free to peruse what your classmates wrote.
Line 30: Line 32:
 
The official IRC channel for this class (and unofficial Swarthmore Computational Linguistics IRC channel) is <code>#swatling</code> on [https://oftc.net/ irc.oftc.net].
 
The official IRC channel for this class (and unofficial Swarthmore Computational Linguistics IRC channel) is <code>#swatling</code> on [https://oftc.net/ irc.oftc.net].
  
There are good, Free IRC clients for nearly every operating system.  Many GNU/Linux installations come with HexChat, which presents a good balance of features and ease of use.  It's available for Windows too (it should be free).  Limechat is a good option for macOS (also should be free).  There are good clients for mobile OSes also.  Plenty of IRC clients are free—you shouldn't have to pay anything—and some of those are Free, meaning you have the freedom to redistribute or modify them.
+
There are good, Free IRC clients for nearly every operating system.  Many GNU/Linux installations (including the '''lab computers''') come with HexChat, which presents a good balance of features and ease of use.  It's available for Windows too (it should be free).  Limechat is a good option for macOS (also should be free).  There are good clients for '''mobile OSes''' also.  Plenty of IRC clients are free—you shouldn't have to pay anything—and some of those are Free as well, meaning you have the freedom to redistribute or modify them.
  
If you prefer a more "modern" feel (with persistent conversation history even when you're not logged in), you may also [[Connecting to OFTC using Matrix|connect to IRC using Matrix]] and any of its clients (e.g., element.io).
+
If you prefer a more "modern" feel (with persistent conversation history even when you're not logged in), you may also '''[[Connecting to OFTC using Matrix|connect to IRC using Matrix]]''' and any of its clients (e.g., element.io).
  
You can also join easily through a web browser by using [http://webchat.oftc.net/?channels=swatling http://webchat.oftc.net/?channels=swatling].  I recommend against this as a solution for the semester—it's better to use it as a once-off backup solution.
+
You can also join easily '''through a web browser''' by using [http://webchat.oftc.net/?channels=swatling http://webchat.oftc.net/?channels=swatling].  I recommend against this as a solution for the semester—it's better to use it as a once-off backup solution.
  
 
We will be using IRC as a way to interact outside of the classroom, and sometimes in.  It can be a good way to ask me for help or clarification on an assignment, in which case your classmates could benefit too.  You can also use it to have general discussion among yourselves, or work remotely in groups.  Keep in mind that any arbitrary channel can be started by any user, and you can always tell which users are in a channel.  <!-- Unofficial channels on Freenode should begin with <code>##</code>, while official ones may begin with <code>#</code> -->
 
We will be using IRC as a way to interact outside of the classroom, and sometimes in.  It can be a good way to ask me for help or clarification on an assignment, in which case your classmates could benefit too.  You can also use it to have general discussion among yourselves, or work remotely in groups.  Keep in mind that any arbitrary channel can be started by any user, and you can always tell which users are in a channel.  <!-- Unofficial channels on Freenode should begin with <code>##</code>, while official ones may begin with <code>#</code> -->
Line 41: Line 43:
  
 
'''Day 1 activity #3: log in to IRC''':
 
'''Day 1 activity #3: log in to IRC''':
# Start up an IRC client
+
# Start up an IRC client (see above paragraphs for options)
 
# Connect to OFTC, set your username (e.g., to match your Swarthmore login name), and connect to the network.
 
# Connect to OFTC, set your username (e.g., to match your Swarthmore login name), and connect to the network.
 
# Join the class channel (see above), and greet anyone already in the channel.  Stay until the end of class to greet others who join.
 
# Join the class channel (see above), and greet anyone already in the channel.  Stay until the end of class to greet others who join.
 
# Did you notice the channel "topic" / welcome message?
 
# Did you notice the channel "topic" / welcome message?
  
=== Using a command line via ssh ===
+
=== Using a command line ===
  
 
The command line makes a linux system particularly powerful, and is useful for activities such as those we'll be doing in this class.  There is an endless amount of information online about how to use the command line.
 
The command line makes a linux system particularly powerful, and is useful for activities such as those we'll be doing in this class.  There is an endless amount of information online about how to use the command line.
  
The easiest way to access all the tools we'll be using for class is to remotely connect to the CS lab machines using ssh.  On <b>Windows</b>, you can [https://www.howtogeek.com/336775/how-to-enable-and-use-windows-10s-built-in-ssh-commands/ use Powershell to access ssh] (putty, which can be downloaded, is another option).  On <b>macOS</b>, simply start the Terminal app.  On GNU/Linux, you can open a terminal of your choice.
+
'''Day 1 activity #4: use some basic commands'''
 
 
'''Day 1 activity #4: connect to the the lab machines remotely'''
 
# Connect to the lab computers over ssh.  Once in a terminal with ssh access (see above), you can do something like the following (replacing "student1" with your username):
 
  ssh student1@lab.cs.swarthmore.edu
 
# You'll be prompted for your password.  You won't be able to see anything when you type it—that's normal.
 
# The prompt should tell you which computer you connected to.  Is it a cooking ingredient, a pokémon character, or something else?  (These are some of the naming schemes used for computers in different physical CS lab rooms at Swarthmore.)
 
 
 
'''Day 1 activity #5: use some basic commands'''
 
 
# Once you're connected, create a directory for this class.  The easiest would be by typing something like <code>mkdir ling073</code>.  Then change into that directory, e.g. <code>cd ling073</code>.
 
# Once you're connected, create a directory for this class.  The easiest would be by typing something like <code>mkdir ling073</code>.  Then change into that directory, e.g. <code>cd ling073</code>.
 
# Referencing a [http://www.makeuseof.com/tag/a-quick-guide-to-get-started-with-the-linux-command-line/ guide to using the linux command line like this one], use <code>cat</code> to create a file in that new directory, use <code>nano</code> to change the contents, and use <code>cat</code> to display the new contents of the file.
 
# Referencing a [http://www.makeuseof.com/tag/a-quick-guide-to-get-started-with-the-linux-command-line/ guide to using the linux command line like this one], use <code>cat</code> to create a file in that new directory, use <code>nano</code> to change the contents, and use <code>cat</code> to display the new contents of the file.
 
# When you're done, see if anyone else needs help doing this.
 
# When you're done, see if anyone else needs help doing this.
  
'''Day 1 activity #6: download linguistic data''':
+
'''Day 1 activity #5: download linguistic data''':
 
# Follow [[Installing_existing_Apertium_data#Get_the_data|the "Get the data" instructions here]] (but not the compiling instructions, yet) to download the code and linguistic data for an existing Italian-to-Sardinian machine translation system.
 
# Follow [[Installing_existing_Apertium_data#Get_the_data|the "Get the data" instructions here]] (but not the compiling instructions, yet) to download the code and linguistic data for an existing Italian-to-Sardinian machine translation system.
 
# Use a [http://www.makeuseof.com/tag/a-quick-guide-to-get-started-with-the-linux-command-line/ guide to using the linux command line like this one] to get the last line of the Italian morphological analyser—specifically the file named <code>apertium-ita.ita.dix</code>.  Once you've found it, do '''not''' announce what you've found—instead, help your neighbour find it, or if they already have, see if it's the same.
 
# Use a [http://www.makeuseof.com/tag/a-quick-guide-to-get-started-with-the-linux-command-line/ guide to using the linux command line like this one] to get the last line of the Italian morphological analyser—specifically the file named <code>apertium-ita.ita.dix</code>.  Once you've found it, do '''not''' announce what you've found—instead, help your neighbour find it, or if they already have, see if it's the same.
  
'''Day 1 activity #7: compile a machine translation system''':
+
'''Day 1 activity #6: compile a machine translation system''':
 
# Finish the [[Installing existing Apertium data|compiling instructions]] for the Italian-Sardinian translation system.
 
# Finish the [[Installing existing Apertium data|compiling instructions]] for the Italian-Sardinian translation system.
 
# Test the system as described.  Try a different Italian input sentence: "Un corpus è un insieme di testi che sono stati selezionati in modo da essere funzionali per l'analisi linguistica." ([https://it.wikipedia.org/wiki/Linguistica_computazionale source]).  What's the output?  What do you think the <code>*</code> symbol might mean?
 
# Test the system as described.  Try a different Italian input sentence: "Un corpus è un insieme di testi che sono stati selezionati in modo da essere funzionali per l'analisi linguistica." ([https://it.wikipedia.org/wiki/Linguistica_computazionale source]).  What's the output?  What do you think the <code>*</code> symbol might mean?
Line 75: Line 69:
 
In this class you will be writing a lot of code (though no programming is required!) so you will want to get comfortable with a text editor.
 
In this class you will be writing a lot of code (though no programming is required!) so you will want to get comfortable with a text editor.
  
The easiest option for everyone will be to use [https://atom.io/ Atom], with a series of plugins.  If there's another editor you're comfortable with, feel free to use that instead.
+
The easiest option for everyone will be to use [https://code.visualstudio.com/ VS Code], with a series of extensions.  If there's another editor you're comfortable with, feel free to use that instead.
 +
 
 +
'''Day 1 activity #7: use a text editor'''
 +
# Open VS Code on the lab machine.
 +
# Find and open the dictionary that you opened in the earlier activity.
 +
# It should auto-detect that the file is in an XML dialect and enable syntax highlighting (i.e., colourising the text so it's easier to read).
 +
#* If it doesn't colourise the text, click the "Select Language Mode" button at the bottom of VS Code (probably the third from the right, will probably be labelled "Plain Text"), and choose XML.
  
'''Day 1 activity #8: install a text editor'''
+
There will be instructions available soon on how to connect to the lab machines remotely using an extension.
# Install [https://atom.io/ Atom].
 
# Now install the following packages (depending on your version, either File > Settings > Install or Edit > Preferences > Install): teletype, ftp-remote-edit, and language-apertium.
 
#* If you're on Windows and get an error about needing the latest version of Visual Studio, try running <code>npm install --g --production windows-build-tools</code> on an admin command line and try again.  Alternatively, try installing "VS 2019 C++ x64/x86 build tools" and "Desktop development with C++" in the Visual Studio installer.  See [https://stackoverflow.com/questions/57541402/node-gyp-configure-got-gyp-err-find-vs this solution] for more information.  This plugin is just for syntax highlighting, so it's not the end of the world if you can't get it working.
 
# [https://www.cs.swarthmore.edu/newhelp/atom-remote-edit.html Set up ftp-remote-edit] to connect to the CS lab servers.  If you skip step 5, the following steps won't make any sense.
 
# Once you've done all this, you should be able to find and open the dictionary that you opened in the earlier activity.
 
  
'''Day 1 activity #9 (optional): collaboratively edit files'''
+
You can also collaborate with a partner remotely using an extension which allows you to edit the same file simultaneously.
# Teletype allows you to collaboratively edit files.  Try it out with someone nearby:
 
## Open one or more files.
 
## Click the antenna icon at the bottom of the screen.
 
## Click "share" or "join".  If you're sharing, send the session id to your partner.
 
## See how it works!
 
  
If you prefer to use an editor other than Atom on your own machine, you can use [https://www.cs.swarthmore.edu/~zpalmer/guides/sshfs/ sshfs] (instructions here from macOS and Windows) to make a directory on the CS lab machine act like a folder on your computer.  To use sshfs in GNU/Linux you can follow a tutorial like [https://www.digitalocean.com/community/tutorials/how-to-use-sshfs-to-mount-remote-file-systems-over-ssh this one].
+
If you prefer to use an editor other than VS Code on your own machine, you can use [https://www.cs.swarthmore.edu/~zpalmer/guides/sshfs/ sshfs] (instructions here from macOS and Windows) to make a directory on the CS lab machine act like a folder on your computer.  To use sshfs in GNU/Linux you can follow a tutorial like [https://www.digitalocean.com/community/tutorials/how-to-use-sshfs-to-mount-remote-file-systems-over-ssh this one].
  
 
== Using git for version control ==
 
== Using git for version control ==
Line 100: Line 90:
 
GitHub is a service that stores repositories and provides access both using git clients and a web interface.  You can think of it as kind of like a much more advanced version of Google Drive.
 
GitHub is a service that stores repositories and provides access both using git clients and a web interface.  You can think of it as kind of like a much more advanced version of Google Drive.
  
'''Day 1 activity #10: make a git repository''':
+
'''Day 1 activity #8: make a git repository''':
 
# Use ssh to connect to the CS lab machines.
 
# Use ssh to connect to the CS lab machines.
 
# Follow [https://www.cs.swarthmore.edu/~newhall/cs31/resources/githubsetup.html these instructions to set up an SSH key] (unless you have done this before).
 
# Follow [https://www.cs.swarthmore.edu/~newhall/cs31/resources/githubsetup.html these instructions to set up an SSH key] (unless you have done this before).
 +
#* You'll want to set up a single SSH key for the lab machines, and then one for each additional system you use (like your own computer).
 
# On the command line, go into your <code>ling073</code> directory (<code>cd ~/ling073</code>)
 
# On the command line, go into your <code>ling073</code> directory (<code>cd ~/ling073</code>)
# If you completed the wiki activity correctly and I'm keeping up with everyone, I'll have given you access to the [https://github.swarthmore.edu/Ling073-sp22/day1 day1 repository] on Swarthmore github.  Assuming that's done, proceed.
+
# If you completed the wiki activity correctly and I'm keeping up with everyone, I'll have given you access to the [https://github.swarthmore.edu/Ling073-sp23/day1 day1 repository] on Swarthmore github.  If you're a BiCo student (Bryn Mawr or Haverford), you'll also need to have logged into the GitHub web interface at least once.  Assuming that's done, proceed.
# Clone the ''day1'' repository (<code>git clone git@github.swarthmore.edu:Ling073-sp22/day1.git</code>)
+
# Clone the ''day1'' repository (<code>git clone git@github.swarthmore.edu:Ling073-sp23/day1.git</code>)
# Add a file named your username in the new directory, and add a short message.  You can use a text editor for this (good practice using Atom!), or just do it on the command line (<code>cd day1; cat > student1</code>; type your message; hit <kbd>Enter</kbd> followed by <kbd>Ctrl</kbd>+<kbd>C</kbd>).
+
# Add a file named your username in the new directory, and add a short message.  You can use a text editor for this (good practice using VS Code!), or just do it on the command line (<code>cd day1; cat > student1</code>; type your message; hit <kbd>Enter</kbd> followed by <kbd>Ctrl</kbd>+<kbd>C</kbd>).
 
# Add the file to the repository (on the command line: <code>git add student1</code>), then commit it with some comment about what you're doing (<code>git commit student1 -m "adding my new file"</code>)
 
# Add the file to the repository (on the command line: <code>git add student1</code>), then commit it with some comment about what you're doing (<code>git commit student1 -m "adding my new file"</code>)
 
# Push your changes (<code>git push</code>) and pull the changes of your classmates (<code>git pull</code>).  Once your classmates have pushed and you've pulled, you can read their messages (<code>cat student2</code>).
 
# Push your changes (<code>git push</code>) and pull the changes of your classmates (<code>git pull</code>).  Once your classmates have pushed and you've pulled, you can read their messages (<code>cat student2</code>).
 
#* If it gives you an error message when you try to push, you may need to run <code>git pull</code> first, to pull and merge in your classmates' changes first.
 
#* If it gives you an error message when you try to push, you may need to run <code>git pull</code> first, to pull and merge in your classmates' changes first.
 
#* When you pull, you may encounter a screen telling you to type a merge message.  You can just quit this.  If it's in nano, use the shortcuts at the bottom of the screen to save and exit (probably <kbd>Ctrl</kbd>+<kbd>O</kbd> followed by <kbd>Ctrl</kbd>+<kbd>X</kbd>).  If it's in vim, hit <kbd>Esc</kbd>, type <code>:wq</code>, and press <kbd>Enter</kbd>.
 
#* When you pull, you may encounter a screen telling you to type a merge message.  You can just quit this.  If it's in nano, use the shortcuts at the bottom of the screen to save and exit (probably <kbd>Ctrl</kbd>+<kbd>O</kbd> followed by <kbd>Ctrl</kbd>+<kbd>X</kbd>).  If it's in vim, hit <kbd>Esc</kbd>, type <code>:wq</code>, and press <kbd>Enter</kbd>.
# Go to the GitHub web interface for the [https://github.swarthmore.edu/Ling073-sp22/day1 day1 repository] to make sure your file made it there.
+
# Go to the GitHub web interface for the [https://github.swarthmore.edu/Ling073-sp23/day1 day1 repository] to make sure your file made it there.
  
 
== Up next ==
 
== Up next ==
If you've successfully completed all of these activities, you can get started on [[Language selection]], your assignment for the next class meeting.
+
'''If you've successfully completed all of these activities''', you can get started on '''[[Language selection]]''', your assignment for the next class meeting.
  
 
[[Category:Tutorials]]
 
[[Category:Tutorials]]
 
[[Category:Activities]]
 
[[Category:Activities]]

Latest revision as of 12:13, 17 January 2023

Logging in on the lab machines

You'll want access to the Swarthmore CS lab machines for working on assignments and projects.

You can also use your own system (optional!) if you want, but getting the tools set up can be difficult, and it's better if everyone begins with equal access to the tools, so I recommend against it for now. Instead, set up your account on the CS lab machines, and see activities below for connecting to and interacting with them.

Day 1 activity #1: get comfortable with lab machines:

  • If you haven't previously used the CS lab machines, you should have received an email for setting up login information. Set up an account according to the instructions.
  • If you don't have access and didn't get an email, let the instructor or course assistant know right away.
  • Once/if you do have an account, check to see if those around you need any help logging in.

Acclimating to the tools we'll be using in class

The exercises below will familiarise you with a few aspects of the tools we'll be using in the course, including using a command line.

Editing a wiki

Wikis offer a great way to collaboratively publish information online. Using this course wiki, you will be publishing publicly-visible materials related to this class throughout the semester. All submissions are logged, so vandalism is easily identified and reverted, and the timestamp of any submitted work is also available.

Day 1 activity #2: edit this wiki:

  • Log in to the class wiki using your Swarthmore login credentials (link at very top of any page on this wiki), and create a user page. Your user page will have a URL ending in User:student1, where "student1" is your username. It's also linked to at the top of the page after you log in (as your username in red).
    • TL;DR: log in, click your username at the top, and click the "Create" tab.
  • Write a short introduction including any information you care to share (e.g., name, major, etc.) and a short bulleted list of a couple languages you're considering for your work in this class (or any languages you're interested in generally, if you haven't thought about this yet). If there's someone in the course you'd like to work with this semester, you can mention that here too.
    • You should use mediawiki markup to format what you write.
    • Warning: anything you put on the wiki is publicly visible. You may not want to include personal information like your name. If you prefer for your work in this course to not even be linked to your Swarthmore username, let me know and I will attempt to make an accommodation.
  • Add the page to the category sp23_students by adding [[Category:sp23_students]] at the very bottom of the page.
  • Enter an edit summary and click "Save page" when you're done. An edit summary is a way of keeping track of what you did, so you could say something like "wrote a short introduction".
  • All pages added to the category above will be listed on the category page—double check that yours is, and feel free to peruse what your classmates wrote.

Using IRC

Internet Relay Chat is a tried-and-tested technology that allows for convenient real-time online messaging for large groups of people, and also between individuals.

The official IRC channel for this class (and unofficial Swarthmore Computational Linguistics IRC channel) is #swatling on irc.oftc.net.

There are good Free IRC clients for nearly every operating system. Many GNU/Linux installations (including the lab computers) come with HexChat, which offers a good balance of features and ease of use. It's available for Windows too (it should be free). LimeChat is a good option for macOS (also should be free). There are good clients for mobile OSes also. Plenty of IRC clients are free (you shouldn't have to pay anything), and some of those are Free as well, meaning you have the freedom to redistribute or modify them.

If you prefer a more "modern" feel (with persistent conversation history even when you're not logged in), you may also connect to IRC using Matrix and any of its clients (e.g., element.io).

You can also join easily through a web browser by using http://webchat.oftc.net/?channels=swatling. I recommend against this as a solution for the semester—it's better to use it as a once-off backup solution.

We will be using IRC as a way to interact outside of the classroom, and sometimes in. It can be a good way to ask me for help or clarification on an assignment, in which case your classmates could benefit too. You can also use it to have general discussion among yourselves, or work remotely in groups. Keep in mind that any arbitrary channel can be started by any user, and you can always tell which users are in a channel.

There's also an IRC channel for Apertium. We will be using Apertium and related software throughout the course, so this can be a good place to ask for assistance and tech support.

Day 1 activity #3: log in to IRC:

  1. Start up an IRC client (see above paragraphs for options)
  2. Connect to OFTC, set your username (e.g., to match your Swarthmore login name), and connect to the network.
  3. Join the class channel (see above), and greet anyone already in the channel. Stay until the end of class to greet others who join.
  4. Did you notice the channel "topic" / welcome message?

Using a command line

The command line makes a Linux system particularly powerful, and is useful for activities such as those we'll be doing in this class. There is an endless amount of information online about how to use the command line.

Day 1 activity #4: use some basic commands

  1. Once you're logged in and have a terminal open, create a directory for this class. The easiest would be by typing something like mkdir ling073. Then change into that directory, e.g. cd ling073.
  2. Referencing a guide to using the Linux command line like this one, use cat to create a file in that new directory, use nano to change the contents, and use cat to display the new contents of the file.
  3. When you're done, see if anyone else needs help doing this.
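
The steps in this activity can be sketched as a short shell session. This is only an illustration under stated assumptions, not the one expected answer: the file name notes.txt and its contents are made up, and the sketch works inside a throwaway directory created with mktemp so it can be run safely anywhere (in class you would simply start from your home directory). Because nano is interactive, it appears only as a comment.

```shell
# Work in a scratch directory so nothing in your home directory is touched
# (an assumption for this sketch; in class, start from your home directory).
scratch=$(mktemp -d)
cd "$scratch"

# Create the class directory and change into it.
mkdir ling073
cd ling073

# Create a file with cat: everything up to the EOF marker becomes its contents.
# notes.txt is a hypothetical file name.
cat > notes.txt <<'EOF'
hello from ling073
EOF

# To change the contents interactively you would run:  nano notes.txt

# Display the contents of the file.
cat notes.txt
```

Typing cat > notes.txt without the EOF block also works: cat then reads what you type in the terminal until you signal the end of input.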

Day 1 activity #5: download linguistic data:

  1. Follow the "Get the data" instructions here (but not the compiling instructions, yet) to download the code and linguistic data for an existing Italian-to-Sardinian machine translation system.
  2. Use a guide to using the Linux command line like this one to get the last line of the Italian morphological analyser (specifically, the file named apertium-ita.ita.dix). Once you've found it, do not announce what you've found; instead, help your neighbour find it, or if they already have, see if it's the same.
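
One possible way to look at the end of a file, shown on a small stand-in file rather than the real dictionary so as not to give away what you'll find. The file contents below are invented for illustration, and tail is just one of several tools that could do this.

```shell
# Build a small stand-in file (hypothetical contents, not the real dictionary).
sample=$(mktemp)
printf '%s\n' '<dictionary>' '  <section id="main" type="standard"/>' '</dictionary>' > "$sample"

# Count the lines in the file.
wc -l < "$sample"

# Print only the last line.
tail -n 1 "$sample"
```

On the real data, you would replace "$sample" with the path to apertium-ita.ita.dix.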

Day 1 activity #6: compile a machine translation system:

  1. Finish the compiling instructions for the Italian-Sardinian translation system.
  2. Test the system as described. Try a different Italian input sentence: "Un corpus è un insieme di testi che sono stati selezionati in modo da essere funzionali per l'analisi linguistica." (source). What's the output? What do you think the * symbol might mean?

Using a text editor

In this class you will be writing a lot of code (though no programming is required!) so you will want to get comfortable with a text editor.

The easiest option for everyone will be to use VS Code, with a series of extensions. If there's another editor you're comfortable with, feel free to use that instead.

Day 1 activity #7: use a text editor

  1. Open VS Code on the lab machine.
  2. Find and open the dictionary that you opened in the earlier activity.
  3. It should auto-detect that the file is in an XML dialect and enable syntax highlighting (i.e., colourising the text so it's easier to read).
    • If it doesn't colourise the text, click the "Select Language Mode" button at the bottom of VS Code (probably the third from the right, will probably be labelled "Plain Text"), and choose XML.

There will be instructions available soon on how to connect to the lab machines remotely using an extension.

You can also collaborate with a partner remotely using an extension which allows you to edit the same file simultaneously.

If you prefer to use an editor other than VS Code on your own machine, you can use sshfs (instructions here for macOS and Windows) to make a directory on the CS lab machine act like a folder on your computer. To use sshfs on GNU/Linux you can follow a tutorial like this one.
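
As a rough sketch of what an sshfs mount looks like on the command line: the helper name mount_lab_dir, the local mount point, and the remote directory name ling073 are all hypothetical, and the host name lab.cs.swarthmore.edu is assumed from the ssh example in an earlier revision of this page. The linked guides remain the authoritative instructions.

```shell
# Hypothetical helper: mount the remote ling073 directory at a local mount point.
# $1 = your lab username, $2 = an empty local directory to use as the mount point.
mount_lab_dir() {
    mkdir -p "$2" &&
    sshfs "$1@lab.cs.swarthmore.edu:ling073" "$2"
}

# Example call (needs sshfs installed and network access, so it is commented out):
# mount_lab_dir student1 "$HOME/lab-ling073"
#
# To unmount: on GNU/Linux,  fusermount -u "$HOME/lab-ling073"
#             on macOS,      umount "$HOME/lab-ling073"
```

The remote path is relative to your home directory on the lab machines, so ling073 here means ~/ling073 on the remote side.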

Using git for version control

Git is software that tracks collaborative work on projects. It stores work (files) in "repositories" and gives you access to previous "revisions". You "clone" an existing repository. You can "commit" changes to your local copy, and "push" the commits back to a remote server. You should "pull" from the server if you think someone else may have also pushed commits.

There is info available about using git at Swarthmore, which includes the commands for the actions just described.

GitHub is a service that stores repositories and provides access both using git clients and a web interface. You can think of it as kind of like a much more advanced version of Google Drive.

Day 1 activity #8: make a git repository:

  1. Use ssh to connect to the CS lab machines.
  2. Follow these instructions to set up an SSH key (unless you have done this before).
    • You'll want to set up a single SSH key for the lab machines, and then one for each additional system you use (like your own computer).
  3. On the command line, go into your ling073 directory (cd ~/ling073)
  4. If you completed the wiki activity correctly and I'm keeping up with everyone, I'll have given you access to the day1 repository on Swarthmore GitHub. If you're a BiCo student (Bryn Mawr or Haverford), you'll also need to have logged into the GitHub web interface at least once. Assuming that's done, proceed.
  5. Clone the day1 repository (git clone git@github.swarthmore.edu:Ling073-sp23/day1.git)
  6. Add a file named after your username in the new directory, and add a short message. You can use a text editor for this (good practice using VS Code!), or just do it on the command line (cd day1; cat > student1; type your message; hit Enter followed by Ctrl+D to end the input; Ctrl+C also works here).
  7. Add the file to the repository (on the command line: git add student1), then commit it with some comment about what you're doing (git commit student1 -m "adding my new file")
  8. Push your changes (git push) and pull the changes of your classmates (git pull). Once your classmates have pushed and you've pulled, you can read their messages (cat student2).
    • If it gives you an error message when you try to push, you may need to run git pull first, to pull and merge in your classmates' changes.
    • When you pull, you may encounter a screen telling you to type a merge message. You can just quit this. If it's in nano, use the shortcuts at the bottom of the screen to save and exit (probably Ctrl+O followed by Ctrl+X). If it's in vim, hit Esc, type :wq, and press Enter.
  9. Go to the GitHub web interface for the day1 repository to make sure your file made it there.
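
The clone, add, commit, push, pull cycle above can be rehearsed without touching the class repository. The sketch below is an assumption-laden stand-in: it creates a local bare repository in a temporary directory to play the role of the GitHub remote, and the name, email address, and message are placeholders. On the lab machines you would clone the real day1 repository instead.

```shell
set -e

# A throwaway sandbox; the bare repository stands in for the GitHub remote.
sandbox=$(mktemp -d)
git init --quiet --bare "$sandbox/day1.git"

# Clone it (git warns that the repository is empty, which is expected here).
git clone --quiet "$sandbox/day1.git" "$sandbox/day1"
cd "$sandbox/day1"

# Placeholder identity, set only for this sandbox repository.
git config user.name "Student One"
git config user.email "student1@example.com"

# Add a file named after your username, then stage and commit it.
printf 'Hello from student1\n' > student1
git add student1
git commit --quiet -m "adding my new file"

# Push the current branch (-u records the remote branch to track), then pull.
git push --quiet -u origin HEAD
git pull --quiet

# Show the history: one commit so far.
git log --oneline
```

With a real shared repository, git pull is where your classmates' files would arrive, and it may open the merge-message screen described above.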

Up next

If you've successfully completed all of these activities, you can get started on Language selection, your assignment for the next class meeting.