{ "cells": [ { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "# Talks markdown generator for academicpages\n", "\n", "Takes a TSV of talks with metadata and converts them for use with [academicpages.github.io](academicpages.github.io). This is an interactive Jupyter notebook ([see more info here](http://jupyter-notebook-beginner-guide.readthedocs.io/en/latest/what_is_jupyter.html)). The core Python code is also in `talks.py`. Run either from the `markdown_generator` folder after replacing `talks.tsv` with one containing your data.\n", "\n", "TODO: Make this work with BibTeX and other databases, rather than Stuart's non-standard TSV format and citation style." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [], "source": [ "import pandas as pd\n", "import os" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data format\n", "\n", "The TSV needs to have the following columns: title, type, url_slug, venue, date, location, talk_url, description, with a header at the top. Many of these fields can be blank, but the columns must be in the TSV.\n", "\n", "- Fields that cannot be blank: `title`, `url_slug`, `date`. All other fields can be blank. `type` defaults to \"Talk\".\n", "- `date` must be formatted as YYYY-MM-DD.\n", "- `url_slug` will be the descriptive part of the .md file and the permalink URL for the page about the talk.\n", "    - The .md file will be `YYYY-MM-DD-[url_slug].md` and the permalink will be `https://[yourdomain]/talks/YYYY-MM-DD-[url_slug]`\n", "    - The combination of `url_slug` and `date` must be unique, as it will be the basis for your filenames\n", "\n", "This is how the raw file looks (it doesn't look pretty; use a spreadsheet or other program to create and edit it)." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "title\ttype\turl_slug\tvenue\tdate\tlocation\ttalk_url\tdescription\r\n", "Talk 1 on Relevant Topic in Your Field\tTalk\ttalk-1\tUC San Francisco, Department of Testing\t2012-03-01\tSan Francisco, California\t\tThis is a description of your talk, which is a markdown files that can be all markdown-ified like any other post. Yay markdown!\r\n", "Tutorial 1 on Relevant Topic in Your Field\tTutorial\ttutorial-1\tUC-Berkeley Institute for Testing Science\t2013-03-01\tBerkeley CA, USA\thttp://exampleurl.com\tThis is a description of your tutorial, note the different field in type. This is a markdown files that can be all markdown-ified like any other post. Yay markdown!\r\n", "Talk 2 on Relevant Topic in Your Field\tTalk\ttalk-2\tLondon School of Testing\t2014-02-01\tLondon, UK\thttp://example2.com\tThis is a description of your talk, which is a markdown files that can be all markdown-ified like any other post. Yay markdown!\r\n", "Conference Proceeding talk 3 on Relevant Topic in Your Field\tConference proceedings talk\ttalk-3\tTesting Institute of America 2014 Annual Conference\t2014-03-01\tLos Angeles, CA\t\tThis is a description of your conference proceedings talk, note the different field in type. You can put anything in this field." ] } ], "source": [ "!cat talks.tsv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Import TSV\n", "\n", "Pandas makes this easy with the `read_csv` function. We are using a TSV, so we specify the separator as a tab, or `\\t`.\n", "\n", "I found it important to put this data in a tab-separated values format, because this kind of data contains a lot of commas, and commas inside fields can break comma-separated parsing. However, you can modify the import statement, as pandas also has `read_excel()`, `read_json()`, and others." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/html": [ "
\n", " | title | \n", "type | \n", "url_slug | \n", "venue | \n", "date | \n", "location | \n", "talk_url | \n", "description | \n", "
---|---|---|---|---|---|---|---|---|
0 | \n", "Talk 1 on Relevant Topic in Your Field | \n", "Talk | \n", "talk-1 | \n", "UC San Francisco, Department of Testing | \n", "2012-03-01 | \n", "San Francisco, California | \n", "NaN | \n", "This is a description of your talk, which is a... | \n", "
1 | \n", "Tutorial 1 on Relevant Topic in Your Field | \n", "Tutorial | \n", "tutorial-1 | \n", "UC-Berkeley Institute for Testing Science | \n", "2013-03-01 | \n", "Berkeley CA, USA | \n", "http://exampleurl.com | \n", "This is a description of your tutorial, note t... | \n", "
2 | \n", "Talk 2 on Relevant Topic in Your Field | \n", "Talk | \n", "talk-2 | \n", "London School of Testing | \n", "2014-02-01 | \n", "London, UK | \n", "http://example2.com | \n", "This is a description of your talk, which is a... | \n", "
3 | \n", "Conference Proceeding talk 3 on Relevant Topic... | \n", "Conference proceedings talk | \n", "talk-3 | \n", "Testing Institute of America 2014 Annual Confe... | \n", "2014-03-01 | \n", "Los Angeles, CA | \n", "NaN | \n", "This is a description of your conference proce... | \n", "