2.1 What is a data server?
I use all my data tools on Ubuntu – which is a Linux operating system – and I suggest that you do the same.
I use a Mac laptop, but a PC works just as well. It doesn’t really matter here, because we won’t install Ubuntu on our own computer; we’ll access it over the internet.
Here we will connect to a remote server: we’ll type commands, and the remote server will run the data analyses instead of our local computer.
This is how you can imagine it.
You type in a command on your computer. Something like: “Sum up this list: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10”.
When you hit enter, you are sending the task to your remote server. It processes the task and then sends back the result, printed on your screen: “55”.
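The command quoted above is plain English, not real syntax. As a minimal sketch of what such a task could look like once it is written as code, here is the same calculation in Python. The function name `sum_list` is made up for this illustration, and the choice of Python is an assumption. Nothing here is specific to a remote server: the code is the same wherever it runs, only the machine doing the work changes.

```python
def sum_list(numbers):
    """Add up all the numbers in the list (hypothetical helper for this example)."""
    return sum(numbers)


# The list from the example: 1, 2, 3, ..., 10
task = list(range(1, 11))

# On a remote server, this line would be executed there,
# and only the printed result would travel back to your screen.
result = sum_list(task)
print(result)  # 55
```

If you typed this on a remote server, the only thing you would see on your own screen is the final number.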
(Note that you can set up Ubuntu on your personal/work computer too, if you really want to. But this is something we usually don’t do in real life, because with that solution we would limit our data processes to our computer’s capacity. We would also lose some cool features.)
If you use a remote server for data analysis, you will be able to:
- Access your data infrastructure from any computer with a username and a password (even if you lose or break your personal notebook or something). Don’t worry, nobody else will be able to access your data on your remote server – this is completely private.
- Automate your data scripts (e.g. make them run every 3 hours, even if you turn off your notebook).
- Scale your stuff. You won’t be limited to your computer’s capacity. Renting a few more processors or some extra memory is just one click away if you are using a remote server.
- Use Ubuntu without installing it on your computer. This is a key point! If you use a remote server, you can’t possibly mess up anything on your own computer. That’s not a bad thing after all.
Keep in mind that a remote server usually costs money. The cheapest solution I know of costs $5 per month. For the service I know, you can get a coupon that makes the first two months free, which is a nice deal, but if you look around on the internet, you might find even better deals. (I didn’t.)
It’s important to know that I’ve been creating all my data coding tutorials (videos and articles) using the exact same data stack that I’ll show you in this video course. So if you plan to take more courses and tutorials on data36.com, it will be really handy if you set everything up just like I do in this video course. To make everything work properly, please be sure to watch all the videos in order and do the steps in the exact same sequence.