This guide walks you through how to put data online using data tool and the DataHub. By the end of this tutorial you’ll have this dataset published: https://datahub.io/examples/sample
Here we focus on tabular data and especially CSV - a universal basic format for structured data.
You can read and learn about CSV in this post.
Install the data CLI tool
Download and install the “data” CLI tool using our separate instructions.
Get some CSV data
We have a sample CSV file for this tutorial. Let’s get that file using our
data get https://datahub.io/examples/sample/r/sample.csv
sample.csv file in the current working directory. You can preview how it looks like:
data cat sample.csv
Publish the data
Note: you need to be logged in to publish data on DataHub. It’s simple and easy - just type data login and follow instructions.
Putting your data online is now just one simple command:
data push [path]
data push sample.csv
It will ask you to name this dataset - you can hit enter to use default name but in our example it’ll be “sample”:
❒ data push sample.csv
? Please, confirm name for this dataset: sample-lazy-yak-63 (sample-lazy-yak-63) sample
Next, you need to confirm title for this dataset - let’s hit enter here to use default one:
? Please, confirm title for this dataset: Sample (Sample)
The final output will be:
🙌 your data is published!
🔗 https://datahub.io/username/sample/v/1 (copied to clipboard)
Here is the GIF of the full process including login:
Note: by default, the dataset is public. Use unlisted flag to hide it in the search results or private flag to restrict access.
data push sample.csv --published
Your data’s online!
Now you can visit the showcase page of the data you’ve just published. It’ll take few moments to process the data:
Once your data is online share the link to it with your colleagues or friends so they can use it in their code! You can find more information about how to use data in our Getting Data tutorial - http://datahub.io/docs/getting-started/getting-data.