class: center, middle ## W4995 Applied Machine Learning # Git, Github and Testing 01/28/19 Andreas C. Müller ??? Hey. Welcome to the second lecture on applied machine learning. Last week we talked a bit about very general and high-level issues in machine learning. Today, we’ll go right down to the metal. We’ll talk about version control, in particular git, about testing, continuous integration and about documentation. Also, I apologize in advance for the slides. There’s a lot of bullet points and few diagrams and I don’t really like that. I'm reusing material from http://rogerdudler.github.io/git-guide/ and software carpentry FIXME: slide notes, be consistent with $ for bash commands FIXME pytest -> pytest FIXME delete some slides, slide order (see recording?) --- class: center # Why do I care? ![:scale 40%](images/final_doc.jpg) ??? FIXME cutting --- class: center ![:scale 90%](images/do-you-need-git.png) --- class: middle # Git and Github https://guides.github.com/ http://rogerdudler.github.io/git-guide/ ??? FIXME: logos So this first part is about git. I assume you all know about git. Who here has not used git so far? I changed this up a bit from previous years since people seemed not very familiar with git, so this will be pretty introductory and less of the crazy behind the scenes stuff from before. You should really use version control any time you work on any code or any other document. And it looks like we’ll have to use git for now and some time to come. So you all have used git, but who of you thinks they understand git? Ok I’ll keep you in mind and ask you trick questions along the way. --- class: center, middle ![:scale 40%](images/git_xkcd.png) ??? Here’s a comic from xkcd that might seem familiar. I know many people that use git by just memorizing some commands and if things go south, they just delete the repository and start fresh. Who of you have done that? I certainly have. The goal of today is that you never have to throw your hands up ever again. --- class: center, middle ![:scale 40%](images/torvalds_ducreux.jpg) ??? Personally I like to think that git was a practical joke played on the open source community by Linux Torvals. Unfortunately, I think the real issue is that kernel developers, genius as they might be, are not the best at creating user interfaces. Git is great, but the user interface is horrible. But let’s try and understand what’s going on. --- # Configuration ```bash git config --global user.name "Andreas Mueller" git config --global user.email "acm2248@columbia.edu" git config --global color.ui "auto" git config --global core.editor "vim" ``` Show your configuration: ```bash git config --list ``` ??? - git command format: git verb - verb: config - who you are - --global - apply everywhere - color: color code output - editor: set default editor You can change these setting at any time. --- # Creating a repository .smallest[ ```bash $ mkdir homework1 $ cd homework1 ``` ```bash $ git init ``` ``` > Initialized empty Git repository in /tmp/homework1/.git/ ``` ```bash $ ls -a ``` ``` . .. .git ``` ```bash $ git status ``` ``` On branch master No commits yet nothing to commit (create/copy files and use "git add" to track) ``` ] ??? - make homework1 directory a repository - .git directory is where git stores the history of the project you can remove the tracking information by removing the .git folder. So deleting homework1 erases everything. Using "git init" is actually less common if you work with github. --- # Cloning a repository .center[ ![:scale 70%](images/clone_new.png) ] ```bash $ git clone git@github.com:aml-sprint-19/homework-1 ``` ??? SSH requires you to have an ssh key on github. Just do it! Otherwise use https. DONT Actually do this for the homework - we'll be using classroom. This is the most common way to create a repo, usually not git init. --- # Workflow? ![:scale 100%](images/git-staging-area.png) --- # Changing a File .smaller[ ```bash $ echo "print('Hello world!')" >> task1.py ``` ```bash $ git status ``` ``` On branch master No commits yet Untracked files: (use "git add
..." to include in what will be committed) task1.py nothing added to commit but untracked files present (use "git add" to track) ``` ```bash $ git add task1.py ``` ] --- # Viewing your history ```bash $ git commit -m "say hello" ``` ``` [master (root-commit) 7d139fb] say hello 1 file changed, 1 insertion(+) create mode 100644 task1.py ``` ```bash $ git log ``` ``` commit 7d139fb317ecfa7d629654b709747e91ececa444 (HEAD -> master) Author: Andreas Mueller
Date: Mon Jan 28 12:37:29 2019 -0500 say hello ``` --- # A note on viewing changes Changes between working directory and what was last staged ```bash git diff ``` Changes between staging area and last commit ```bash git diff --staged ``` --- # Referencing different versions - Shorthand for different versions of a repository (refers to commits) - Current Version (most recent commit): ``HEAD`` - Version before current: ``HEAD~1`` - Version before that: ``HEAD~2`` - Each of these also has a commit hash - use ``git log`` to get appropriate hash --- # Exploring History Changes made in the last commit ```bash git diff HEAD~1 ``` Changes made in the last 2 commits ```bash git diff HEAD~2 ``` Changes made since commit hash... ```bash git diff 0b0d55e ``` ??? - first 7 characters - use `git log` to find commit you want --- # Recovering Older Versions of Files - Overwrite task1.py: ```bash $ echo "print('goodbye, cruel world!')" > task1.py $ cat task1.py ``` ``` print('goodbye, cruel world!') ``` Recover last recorded version: ```bash $ git checkout HEAD task1.py $ cat task1.py ``` ``` print('Hello world!') ``` ??? * `checkout HEAD` means revert to version in `HEAD` * can use commit hash to revert to even older version * `task1.py`: tells git which file to revert * in `git status` they list this option with `--` instead of `HEAD`. This is a shortcut. --- # Recovering Older Versions of whole repo ```bash $ echo "print('goodbye, cruel world!')" > task1.py $ git add task1.py $ git commit -m " saying goodbye" ``` ``` [master 95f3b39] saying goodbye 1 file changed, 1 insertion(+), 1 deletion(-) ``` $ git log ``` commit 95f3b39402451c474e3887be94d2cc37a11be511 (HEAD -> master) Author: Andreas Mueller
Date: Mon Jan 28 12:44:52 2019 -0500 saying goodbye commit 7d139fb317ecfa7d629654b709747e91ececa444 Author: Andreas Mueller
Date: Mon Jan 28 12:37:29 2019 -0500 say hello ``` --- # Resetting: ```bash $ git reset --hard HEAD~1 ``` ``` HEAD is now at 7d139fb say hello ``` --- # Branches ```bash $ git checkout -b "new_feature" ``` ``` Switched to a new branch 'new_feature' ``` make some changes, add, commit... Moving between branches: ```bash $ git checkout master ``` ``` Switched to branch 'master' ``` changes are not present in master.. --- # Merge - Fast-forward merge: ![:scale 90%](images/git_fast_forward_merge.png) - Merge-commits: ![:scale 90%](images/git_merge_commit.png) ??? Now let's talk about the the more complex operations to the repository or history, merging . They change the commit graph, and possibly the working tree. I find it most helpful to think of them as graph operations on the repository. Let's start with merges. There’s two kinds of merging: feed-forward merges, and merges that require a commit. If one commit is a descendants of the other and you merge them, it will be a feed-forward, and just move the branch. That happens for example if you just pull new changes from a remote repository that you cloned, if you haven't made any local changes. However, if one isn’t a descendant of the other, git will create a “merge commit” that will unite the two, and have both as its parent. Git will attempt its best but if there’s conflicting changes, you might need to resolve the conflicts by hand. --- # Collaborating using git & github --- # Remote repository - central location everyone can see - requires network ? - github, bitbucket - public vs private ![git repo first push](images/github-repo-after-first-push.jpg) --- # Github ??? I want to just briefly talk about github. For the purposes of git, github is just another remote. And it’s really nothing special. In terms of being a remote, you can replace github with a usb stick that you hand around. There is a lot of tools for user management and issue tracking which is great, but not really central what we’re talking about here. The main thing that’s different in using github from using any other remote, is the use of pull requests, and the ability to integrate with remote services. For now lets talk about pull requests, we'll come back to the services later. --- class: center # Github pull request workflow
![:scale 50%](images/forks1.png) ??? So what are pull requests for? Basically they allow you to contribute to a repository to which you don’t have write permissions. Let’s say you want to contribute to scikit-learn. Or to the lecture slides. There’s the main repository, which we usually call "upstream", scikit-learn/scikit-learn. You want to change something there, but you don’t have write permissions. --- class: center # Github pull request workflow
![:scale 50%](images/forks2.png) ??? What you can do is "fork" it on github – that’s a github concept, not a git concept. It’s just a clone that also lives on github. --- class: center # Github pull request workflow
![:scale 50%](images/forks3.png) ??? To actually make any changes, you can then clone this fork locally, add a feature branch, make changes and push them. --- class: center # Github pull request workflow
![:scale 50%](images/forks.png) ??? Then you can ask the owner of the repository if they want to merge your changes. That's a pull request and allows the owner to "pull" in your changes into the main repository. So far, so good. Makes mostly sense, I think. There is a slight oddity in this workflow, though. --- class: center # Github pull request workflow
![:scale 60%](images/forks_master_feature.png) ??? Usually you pull the current status from the master branch, and make some changes that you want to include. So you always pull master and push feature branches. But the code in the upstream repository changes. How do you get these changes onto your laptop? If the original repository gets updated, there is no way to directly update your fork. --- class: center # Github pull request workflow
![:scale 60%](images/forks_master_feature2.png) ??? You have to add upstream as an additional remote, and then you can pull from there. You can create new features on top, and push the feature branches to your fork (which is usually the “origin” remote). But what happens with the master branch on your fork? It never gets updated and there is no point in pushing to it, so it will just sit there and rot, staying at the state of the original repository at the time of your fork. I think that’s kind of weird, but that’s how github works. So you always pull from upstream to get their changes, and push to your fork. --- # Typical Workflow - Clone - Branch - Add, commit, add, commit, add, commit, ... - Merge / rebase - Push ??? So here’s a typical workflow for making a change to an existing project. You clone the project from some remote repository. You change some files, add them, and commit them. Then you push them to the remote repository. That’s more or less what you’ll do with your homework. The advanced version of this is after cloning you create a branch, you make all your changes on the branch, and once you’re done, you merge the changes from your branch into the master branch. Everybody good so far? I want to give some basic tips that will help you even with these simple workflows. --- # Some tips - `$ git status` - Install shell plugins for status and branch (`oh-my-zsh`) ![:scale 90%](images/git_shell_extension.png) - Set editor, pager and diff-tool (check out meld!) - Use `.gitignore` ??? You should always call git status, to see what’s happening with your repository. It will tell you whether you changed anything, and which branch you’re on. I actually highly recommend using a plugin for your shell, that will give you the status in every line. If you use zsh, you can use oh-my-zsh for example. If you use something else like bash, there are also plenty of options out there. You should also set your editor and pager to something that is familiar to you. The default editor is vim, and if you’re not a vim user, you might want to change that. It’s also helpful to set up a diff program that you like. I like meld for example, that allows you to compare whole folders in a nice way. You should also use gitignore. It's a simple text file that allows you to ignore certain files or folders or file types. For example, you could ignore all .pyc files. Or you could ignore all .ipynb-checkpoints or your dataset folder. FIXME add gitignore example --- # Git log ![:scale 80%](images/git_log.png) ??? You should also become friends with git log. Git log allows you to view the history of your repository. It has a couple of very helpful options. Plain git log will have very long output and show full commit messages. If you want short summaries, use oneline. Often, it’s also useful to annotate branches, which you can do with A the decorate option. If you want to show more than just the branch you’re on, you can use the all option. Now it’s a bit hard to track the relations of the different commits, though, and you might want to use the graph option to show the structure of the history. You might want to alias a command like that because it’s rather long to type, but very informative. I’ll show you an alternative in a bit. --- class: center, middle ![:scale 50%](images/git_adog.png) ??? --- class: center, middle # git gui ![:scale 80%](images/git_gui.png) ??? Git gui is basically an interface for git add and git commit. You can also use it to push, but not much more. This is the way I create most of my commits. You can see very easily which files have changed, see the changes, and stage single lines or whole files. Then you enter the commit message here, and click commit. --- class: center, middle # gitk ![:scale 80%](images/gitk.png) ??? Another tool that I use all the time is gitk. This is basically a graphical interface to git log, showing you the history of your project. But you can also use it as a graphical interface for reset, and for cherry-picking, which we won’t go into. Often I run gitk with gitk –all, which will show you all the branches. This is quite similar to the Git log –oneline –anotate –graph –all that I did earlier – but now in a gui! This is usually how I do resets and how I look at the graph before I do any merges or rebases. You can also use gitk to search commit messages, and there are many option on what to display and how. I mostly use the options to selectively show some branches that I’m interested in. --- # Understanding Git - Working directory - Repository (Commit graph, history) - Index (Staging Area) - Branches - Head ??? So now, I want to work towards really understanding git, and here are some lower level concepts that are important. First, the working directory. That’s the actual current content of the folder on disk. Then there is a graph of commits, also known as the history. Formally that's called the repository, though usually I think of the repository as the whole thing. It's a directed acyclic graph that contains all the changes and contains the information about the parents of a commit. Then the index. The index is what's also called the staging area, it's an intermediate space in which you accumulate changes you want to put in a commit. Branches are quite simple. They are just pointers to particular commits. They point to certain commits. Head is another pointer. Head points to the current active branch. That is the branch that will be updated if you make a commit. (unless you’re on no branch, in which case you are headless) The state of you repository is really much more than the state of the directory, it’s the state of all of these five things together. And if you think about commands, you should think about them in terms of what they do to each of these five. --- class: compact # git Commands .smallest[ - .normal[`git add`]
puts files from working director into staging area (index) If not tracked so far, adds them to tracked files. - .normal[`git commit`]
commits files from staging area (index) to repository, moves current branch with HEAD - .normal[`git checkout [
] [
]`]
Set `
` in working directory to state at `
` and stages it. - .normal[`git checkout [-b]
`]
moves HEAD to `
` (-b creates it), changes content of working dir - .normal[`git reset --soft
`]
moves HEAD to `
` (takes the current branch with it) - .normal[`git reset --mixed
`]
moves HEAD to `
`, changes index to be at `
` (but not working directory) - .normal[`git reset --hard
`]
moved HEAD to `
`, changes index and working tree to `
`. ] ??? Ok so now, let’s got through some of the commands and talk about what they do in terms of these five concepts [read slide] --- class: center ![:scale 60%](images/git_data_transport.png) ??? I found this data flow diagram online and I think it's quite helpful for a subset of the commands. It shows how you can propagate changes from the working directory to a remote repository and back. This is only a small part of all the commands, though. For example reset is missing, and generally nothing that moves branches or head is here. --- # reflog .left-column[ `$ git reflog` ![:scale 100%](images/git_reflog.png) ] .right-column[
![:scale 100%](images/git_simple_merge_graph.png) ] ??? The last command I want to mention is reflog. Reflog is a very powerful tool that allows you to go through the history of your project, but not the same was as log. Reflog actually tracks the changes to HEAD that you do with all the crazy commands that we talked about. Log and gitk will only show you commits that are part of branches. If you do a rebase, for example, all the previous commits are still there, but not part of a branch any more. Remember, rebasing creates new hashes, so the rebased commits are different commits. Imagine you want to go back to some states that are not part of any branch any more. Reflog allows you to do that. So if you break something during a rebase, or you lost a commit, you can always find it with reflog! --- class: center, middle # Git for ages 4 and up: https://www.youtube.com/watch?v=1ffBJ4sVUb4 (with play-doh!) ??? For some of you, this was probably too slow, and for some of you, this was probably too fast. Who here knew already all of this? So for those of you who I managed to totally confuse, I recommend you go through these slides again, and also watch this video. It’s pretty good, but it’s also pretty long. And for those of you who though this was wayyy to much detail: this was still not all the important parts. There’s also tags, and bare repositories and the stash and cherries... --- class: center, middle --- class: center, middle # Unit Tests and integration tests ??? So last time we talked about how we can guard ourselves against some simple issues with our syntax checkers, but that doesn’t find all errors, and it doesn’t tell us if our code works. So we now we’ll talk about unit testing and integration testing. Who of you has worked with an automatic testing framework? What is it for and why would we want it? --- class: spacious # Why test? - Ensure that code works correctly. - Ensure that changes don’t break anything. - Ensure that bugs are not reintroduced. - Ensure robustness to user errors. - Ensure code is reachable. ??? So yes, we want to make sure that our code is correct. We also want to make sure that if we rewrite something, it remains correct. If you have good tests, it is much easier to aggressively refactor your code. If your tests are passing, everything works! Another important kind of test is no-regression tests. You found a bug, you fixed it, and now you want to make sure you don’t re-introduce it. The easiest way is to write a test that tests for the bug. You also want to make sure that your program behaves reasonable, even in edge-cases such as invalid input. And finally, testing can help you find code that is actually unreachable by measuring coverage. If you can’t write a test that will reach a particular part of the code, it’s never executed, and you should probably think about that. --- class: center, middle # Test-driven development? ??? I love tests. But there are some people that love tests even more, and they practice what’s called test-driven development. Who here has heard about that? In test driven development, you write the tests before you write the code. And there are advantages to that. I don’t usually do that myself. However, I do test-driven debugging. If I find a bug, I first write a test that fails if the bug is present. I need to do that later anyhow, because I need a non-regression test, and it usually makes debugging much easier. I often do example driven development, where I write down some simple use-cases and then write implementations to fullfil them. --- # Types of tests - Unit tests – function does the right thing. - Integration tests – system / process does the right thing. - Non-regression tests – bug got removed (and will not be reintroduced). ??? I usually think of tests in terms of three kinds: unit tests, that test the smallest possible unit, usually one function. Then, there’s integration tests, that test that the different parts of the software actually work together in the right way. That is often by testing several application scenarios. And finally, there’s non-regression tests, which can be either a unit tests, an integration test, or both, but they are added for a specific scenario that failed earlier, and you’ll accumulate them as your project ages. --- # How to test? - pytest – http://doc.pytest.org - Searches for all test_*.py files, runs all test_* methods. - Reports nice errors! - Dig deeper: http://pybites.blogspot.com/2011/07/behind-scenes-of-pytests-new-assertion.html ??? There are several frameworks to help you with unit testing in python. There’s the built-in unittest module, there’s the now somewhat abandoned nosetests. For the course we’ll be using the pytest module, which is also what I’d recommend for any new projects. What it does is it searches all files starting with `test`, runs them, and reports a summary. The tests should contain assert statements, and if any of them fail, you’ll get an informative error. This is actually done with a considerable amount of magic, which you can read up on here, if you’re into rewriting the AST. --- # Example .smaller[ ```python # content of inc.py def inc(x): return x + 2 # content of test_sample.py from inc import inc def test_answer(): assert inc(3) == 4 ``` ] ![:scale 60%](images/pytest_failure.png) ??? So here I have a slightly modified version of the example from the pytest website. We have a function called inc that’s supposed to increment a number. But there’s a bug: it adds two instead of one. And we have a test, which is called `test_answer`, that calls the inc function with the number three and checks if three is correctly incremented. If we call pytest on this file, or on the folder of this file, pytest will run the `test_answer` function - because it starts with `test_`. It tells us it ran one test, and that test failed. We also get a traceback showing us the line and what the actual value was. We got 5 instead of 4. Depending on how much you know about programming, this error message should come as a bit of a surprise to you. The assert only gets a boolean, but pytest actually unpacks the `False` and gives us more information. --- # Example ```python # content of inc.py def inc(x): return x + 1 # content of test_sample.py from inc import inc def test_answer(): assert inc(3) == 4 ``` ![:scale 70%](images/pytest_passed.png) ??? So now we go back and fix our increment function, and run pytest again. This time, it tells us the tests passed. Does that mean the function is correct? No, but we could test more. How? We could do the same with more numbers. Tests are usually only necessary, not sufficient for correctness. Usually all test for a project are in a separate file or even a separate folder. In this example we had the test and the function to be tested in the same file, but that’s not good style. The actual implementation should be completely separate from the tests. --- # Test coverage .smaller[ .left-column[ ```python # inc.py def inc(x): if x < 0: return 0 return x + 1 def dec(x): return x - 1 ``` ] .right-column[ ```python # test_inc.py from inc import inc def test_inc(): assert inc(3) == 4 ``` ] ] .reset-column[ ![:scale 70%](images/pytest_coverage1.png)] ??? Here’s a more complex example, with inc.py containing two functions and test_inc.py containing the same test. I added an if into the inc function, and a dec function. If we run the test, they still pass. But clearly we’re not testing everything. In this example, it’s easy to see that we don’t test the if branch and we don’t test the dec function at all. If you project is larger and more complex, it’s much harder to figure out whether you covered all the edge-cases, though. That’s where test coverage tools come in handy. --- # Test coverage .smaller[ .left-column[ ```python # inc.py def inc(x): if x < 0: return 0 return x + 1 def dec(x): return x - 1 ``` ] .right-column[ ```python # test_inc.py from inc import inc def test_inc(): assert inc(3) == 4 ``` ]] .reset-column[ ![:scale 50%](images/pytest_coverage2.png) ] ??? So instead of just calling pytest, we specify --cov inc, which means we want to test the coverage of the inc module or file. Now we get a coverage report that tells us that out of the 6 statements in inc.py, two were not covered, resulting in 67% coverage. There are several ways to figure out which lines we missed, I like the html report the most. --- # HTML report .larger[ ```bash $ pytest --cov inc --cov-report=html ``` ] .left-column[![:scale 100%](images/pytest_html_report1.png)] .right-column[![:scale 100%](images/pytest_html_report2.png)] ??? So we specify --cov-report=html, and pytest will create a html report for us. We get an overview that contains the same information as before, but we’ll also get a detailed view of inc.py, which shows us which lines we covered and which lines we missed. And now we can clearly see that we never reached the return 0 line in inc and we never tested dec. Whenever your write tests, you should make sure they cover all cases in your code, in particular the different algorithmic pieces. Covering all error messages is possibly not as important. You should usually aim for coverage in the high nineties. If you look at projects on github, you can often see a badge that says “coverage X %”, showing the code coverage. These badges are actually automatically generated, and next I want to talk about how that’s done. --- class: center, middle # Continuous integration (with GitHub) ??? This is the magic of continuous integration, which I mentioned already last time. Continuous integration is a general paradigm in software engineering, of automatically running integration tests whenever you change your software. I want to talk about it in particular in the context of github, because that’s what you’ll likely be using, both here and later in industry – at least in a startup. If you go to google or facebook of amazon, they all have their own frameworks, but the same principles apply. They often have even more involved systems, like continuous deployment, which actually automattically puts new code into production if it passes tests. We're not gonna talk about that here. Continuous deployment gets somewhat trickier with machine learning algorithms. FIXME screenshot?! --- # What is Continuous integration? - Run command on each commit (or each PR). - Unit testing and integration testing. - Can act as a build-farm (for binaries or documentation). - requires clear declaration of dependencies. - Build matrix: Can run on many environments. - Standard serviced: TravisCI, Appveyor, Azure Pipelines and CircleCI ??? Ok so what’s continuous integration in more detail? It runs some sequence of commands on a cloud machine every time a commit is made, either just for a particular branch, or for each pull request. Usually the command is just running the test suit, but you can also check coverage or style or build binaries or rebuild your documentation. This is done on a clean cloud machine, so you need to be very explicit about your dependencies, which is good. You can specify a build matrix of different systems you want to run, such as linux and os X, different versions of Python and different versions of the dependencies, like older and newer numpy versions. There are some standard services out there that are free for open source and educational purposes, travis, jenkins and circle are some of them. We’ll use travis for this course. --- # Benefits of CI - Can run on many systems - Can’t forget to run it - Contributor doesn’t need to know details - Can enforce style - Can provide immediate feedback - Protects the master branch (if run on PR) ??? So what’s the benefits of doing this? You can run it on many systems, and in parallel. You might not have all the operating systems that you want it to run on, and you might not go through the hassle of trying out every patch on every machine. Because CI is automatic, you can’t forget to run it. There’s github integration that will give you a red x or a green check mark, and everybody will know whether your commit was ok or not. You can make a change to a package without knowing all the requirements and even without knowing how to run the tests. The CI will complain if you broke something. And you get immediate feedback while you’re working, because tests are run every time you commit! Most projects will only merge pull requests that pass CI, and that means the master branch can not break and will always be in working condition, which is important. --- # What does it do? - Triggered at each commit / push - Sets up a virtual machine with your configuration. - Pulls the current branch. - Runs command. Usually: install, then test. - Reports success / Failure to github. ??? Ok so let’s go through the steps that the CI performs. Let’s say you configured travis for one of your branches on github. If you push to that branch, travis will be triggered. A virtual machine will be set up with the configuration that you specified. The machine pulls your current version of the project. If it’s a pull request, it might also use the code that would be the result of a merge, so it would be your changes applied to the current project. Then some commands are run that you can specify. Usually these are installing the package, possibly including dependencies, and then running the test suite. After the test-suite runs, it will report back to github. --- # Setting up TravisCI .smaller[ - Create account linked to your github account: http://travis-ci.org
![:scale 60%](images/travis_activate.png) - Check out docs at https://docs.travis-ci.com ```yaml language: python python: - "2.7" - "3.4" - "3.5" - "3.6" - "3.7" - "nightly" # command to install dependencies install: - pip install -r requirements.txt # command to run tests script: - pytest # or pytest for Python versions 3.5 and below ``` ] ??? So in general to set up travis you have to create an account that is linked to your github account, and enable travis-ci on the repository you want. For public repositories and students its free, otherwise you have to pay for the service. For the homework you'll need to enable this for your repo. There a docs online at this website, and the only thing you need to do is create a .travis.yml file with the configuration. Here, we specify the versions of python we want to run, the install command, and the final command. This will run pip install with a requirements file. You could also run python setup.py install,or specify requirements here directly. --- # Using Travis - Triggered any time you push a change - Integrated with Pull requests - Try a pull request on your own repository! ??? Travis is run automatically each time you push a change, so in the homework, push a change and then see the status page. You can also do a pull request on your own repository, so you can see the pull request integration. It’s pretty cool and you should try it out. --- class: middle # Questions ?