Categories
Blog post

Running a cheap server

I’ve been running my own node on the fediverse for a little while now: pleroma.gidikroon.eu. It is mainly a hobby, but partly a way to learn as well. I had not done any sysadmin work before, nor set up my own server. This WordPress blog is one of those one-click affairs. I also believe that part of the promise of the fediverse is to be able to self-host your social media presence while staying in contact with others on other servers; I wanted to experience how that would work out.

If a server is only for personal use, you don’t need many resources. Some people install the software on a PC at home that they leave turned on. Others use a Raspberry Pi, a very cheap single-board computer that you can get for under €50. Cheap cloud providers like DigitalOcean and Hetzner are popular too. There are also services that can run your fediverse server for you, like masto.host and spacebear.

I wanted to try using Amazon Web Services (AWS). By far the easiest setup, as well as the cheapest (within AWS), is Amazon Lightsail. The smallest server option is only $3.50 per month with everything included. If you work out the cost of doing the same thing on Amazon EC2 (a t3a.nano instance running full time, plus block storage, network traffic and an IP address), it is more expensive and you have to set up everything yourself. Lightsail (like DigitalOcean, Hetzner and others) is really easy and cheap for people who don’t want to spend a lot of effort on self-hosting something.

On this smallest server option I have run at the same time:

  • My Pleroma server (an ActivityPub server)
  • The PostgreSQL database
  • Another development Pleroma server (to test stuff)
  • A relay that receives and sends posts between fediverse servers

The Pleroma server was also subscribed to one of the busiest relays, so posts came in at a rate of about three per second, around the clock. I had, however, already stopped the second (development) Pleroma server a while ago.

So what are some of the things I learned so far? The main thing relates to cloud servers being ‘burstable’. This means that they can only run at full speed for short periods of time (up to around half an hour), after which the speed gets reduced to 5% of full capacity. Which is perfect for personal servers that only get visited occasionally. Not so much for a server that gets constant traffic 24/7.

Amazon is clear that you don’t get full capacity. In fact, these are recent metrics for my server, which clearly show a very low sustainable zone:

At the end of the graph you can also see that after my server has been above the sustainable zone for over a couple of hours, it has now been put into what I call cpu jail: the server will not do more than 5% of its full capacity. The problem is that other servers, relays, etc. keep talking to it, and when their connections fail, they will just retry later. The load being placed on the server will not let up, and so this cpu jail situation will not resolve itself.

(BTW you can see that my approach to sysadmin is very low-effort, since I’m continuing to type this blog post while the above graph is a real current picture of the server that’s locked up. It needs to be a hobby after all and I’ll look into it when I feel like it.)

In the graph you can also see that early in the morning the server had just crashed after a similar period in cpu jail.

What to do about it?

For now I have stopped the relay server and unsubscribed from the other relay. This should mean that (after a while) the only social media posts coming in will be from people I actually follow. This takes a while to take effect, however, as most servers are still retrying their earlier failed deliveries.

I have also looked into the configuration of the software. By default it is configured for medium-sized servers that have dedicated resources. These have multiple CPUs that work in parallel. The software makes use of that by queuing background jobs and having several worker threads deal with the queued jobs in parallel. The standard configuration has about five different queues with up to 25 parallel workers each. That is not suitable for a cheap burstable cloud server.

Remember that I get only 5% of one CPU. Not 25 full-time CPUs…

So I changed the configuration to have only one worker per queue. That is still too much, but I can’t configure it lower. I have also reduced the number of parallel connections to the database (which was 10), for the same reasons.
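Why one worker per queue is still a lot can be shown with a small stand-in. Pleroma itself is written in Elixir, so this Python sketch is not its job system; it only mimics the shape described above (several queues, each with its own worker pool). The function name, the queue names and the dummy job are all invented for the illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def peak_concurrency(workers_per_queue, queue_names, jobs_per_queue=20):
    """Run dummy jobs and return the highest number that ran at the same time."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def job():
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.01)  # stand-in for real work such as handling a post
        with lock:
            running -= 1

    # One independent worker pool per queue, as in the description above.
    pools = [ThreadPoolExecutor(max_workers=workers_per_queue) for _ in queue_names]
    futures = [pool.submit(job) for pool in pools for _ in range(jobs_per_queue)]
    for future in futures:
        future.result()
    for pool in pools:
        pool.shutdown()
    return peak


if __name__ == "__main__":
    # Hypothetical queue names; the real ones may differ.
    queues = ["incoming", "outgoing", "mail", "media", "background"]
    print(peak_concurrency(1, queues))  # never more than 5: one job per queue
```

Even with a single worker per queue, up to five jobs can be in flight at once, all competing for the same 5% of one CPU.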

The database. That’s the next thing.

First the easy thing: with the cheapest server you get 20GB of storage, which is more than enough to keep your database on. However, since I subscribed to a relay and kept the posts for three months, I essentially had everything published on the whole fediverse over the last few months in my database, which was about 15GB. It turns out that PostgreSQL can need free space for up to double the size of the database for some of its management operations (those that rewrite or copy the data). So I attached an extra disk within Lightsail for all data, which also makes it easier to move between instances. It costs some extra, but makes stuff easier.
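The numbers above are easy to check. This is a rough sketch that uses the "double the size" figure as a rule of thumb and ignores the space taken by the operating system and other software; the function names are made up for the illustration.

```python
def space_needed_gb(db_size_gb, factor=2.0):
    """Disk space to reserve for a database, using 'double the size' as a rule of thumb."""
    return db_size_gb * factor


def fits_on_disk(db_size_gb, disk_size_gb, factor=2.0):
    """True if the disk has room for the database plus its working space."""
    return space_needed_gb(db_size_gb, factor) <= disk_size_gb


if __name__ == "__main__":
    print(space_needed_gb(15))    # 30.0 GB wanted for a 15GB database
    print(fits_on_disk(15, 20))   # False: the included 20GB disk is too small
```

By this rule of thumb a 20GB disk comfortably holds a database of up to about 10GB, and less once the operating system is counted.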

A while ago I decided to upgrade the database from version 11 to 12. Seemed easy enough, since there is a command in the Debian operating system that does everything for you:

pg_upgradecluster 11 main

This automatically creates a new version 12 cluster and uses the ‘dump’ method (also used for backups) to stream data out of the old cluster straight into the new cluster. Sounds fine.

But the ‘burstable’ concept also applies to reading and writing data on a disk. Again, Amazon is clear about this and says that you can’t really load more than 10GB of data into a database on the smallest server options.

So after this had been reading and writing at about 100MB/s for a while, the speed dropped to about 100kB/s. I left it running for over 24 hours, but in the end the process just crashed. The ‘pg_upgradecluster’ command nicely rolled back all changes and everything was operational again, but still on version 11.
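A quick back-of-the-envelope calculation shows why 24 hours was never going to be enough. This sketch assumes the database was still about 15GB at the time, that every byte has to pass through the disk only once, and decimal units (1GB = 10^9 bytes); the real dump both reads and writes, so it would be slower still. The function name is made up.

```python
def transfer_hours(size_gb, rate_bytes_per_second):
    """Hours needed to move size_gb gigabytes at a constant rate."""
    seconds = size_gb * 1e9 / rate_bytes_per_second
    return seconds / 3600


if __name__ == "__main__":
    fast = transfer_hours(15, 100e6)   # 100MB/s, before throttling
    slow = transfer_hours(15, 100e3)   # 100kB/s, after throttling
    print(f"{fast * 60:.1f} minutes at full speed")   # 2.5 minutes
    print(f"{slow:.1f} hours when throttled")         # 41.7 hours
```

The throttled rate is a thousand times slower, so a job of a few minutes turns into one of almost two days.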

Reading up on things I found that I should have added an extra option:

pg_upgradecluster --method upgrade 11 main

With this extra option, it would not use the ‘dump’ method, but an ‘upgrade’ method to change the data from the version 11 format to the version 12 format. With this, the process was done, successfully, in five minutes, rather than unsuccessfully in 24 hours.

I realize however that if I ever need to restore the database from a backup, which necessarily uses the ‘dump’ method, I will run into the same problems.

It seems that theoretically you can run the Pleroma software on the cheapest cloud server option, but there are limits to what it can do. I am glad, however, that I used Pleroma and not Mastodon, since Mastodon requires many more resources to run and is not really suitable for self-hosting.

The learning continues.

Categories
Blog post

Tonight the episode of Van der Valk with Stephanie Leonidas will be on UK TV. 20:00 UK time on ITV.

I’ve seen this episode because it was shown as a preview on Dutch TV on New Year’s Day. Certainly worth a watch, but then so is everything with Stephanie in it…

Categories
Blog post

Louise and band (from home)

This is so much fun: Louise and her band performing Escapade from home.

https://www.instagram.com/tv/B_aYyrahVbq/?igshid=1wj6idv1houja

Categories
Blog post

Happy Birthday Ryan Newman!

Embedded Instagram post by Ryan Whitney Newman (@ryrynewman): “feeling twenty-two ✌🏼✌🏼”

Categories
Blog post

I like the concept of using embeds of external content on your own site. This way you can show what you’re interested in or what your message is, like you would when retweeting, but your audience is not tied to one provider like Twitter or Instagram.

However, importing these posts like many people do is something I want to avoid: the posts should remain owned by their authors, they should be able to delete the post, and likes, views, etc should count towards their posts. Embedding provides the solution to this. The content is loaded from the external site (e.g. Twitter) and if an author has deleted the post, the embedded block will remain empty.

When a user on a federated service posts a message, the expectation is that it will be shared with other servers. Just like when you email someone, you understand that the message leaves your computer and possibly goes to another provider. However, if you tweet or post on Instagram, you would be rightfully annoyed to find your post copied onto someone else’s website, even after you have maybe deleted your post.

I also wanted to find out whether posts with such embeds show up correctly when published to a federated network using the ActivityPub protocol. They don’t yet… Whether it’s a tweet with a video or text only, or an Instagram post, they show up mangled in the fediverse.

What would be ideal is the published, federated post containing nothing more of the embedded object than the link. Receiving servers could then display the embedded object themselves, if they so choose. A bit like how Friendica shows imported tweets.

Categories
Blog post

Another experiment is to start a page of embedded Instagram posts that I find interesting. I’ll be adding to the page, or maybe I’ll convert it into posts. Anyway, at the moment it is mainly her:

Categories
Blog post

Started a Tweet Cage, mainly to embed Tweets by others that I think are important to keep. But of course I’ve also included a tweet by me that somehow got 6000+ likes…

Categories
Blog post

Still excited that Chloë Grace Moretz once answered my question in a Q&A… Even though it was over a year ago…

Categories
Blog post

Deleted much

I’ve now removed most of those posts imported from Tumblr that were just a reblog of someone else. That’s more than 1500 of them. These shouldn’t have been imported in the first place.

The next task in the Tumblr import mess cleanup is to fix the posts that are broken. Videos especially didn’t come through correctly, so I will just re-upload them. But photo sets aren’t ok either.

In the meantime I’m trying to add microformats2 and other meta tags to this blog in a child theme. The existing plugins don’t quite do it the way I want. The eventual goal is to have a better integration with Twitter and others using Bridgy.

I also have plans to use categories more to separate subjects as if they are separate blogs, with an overview on the homepage. You may notice the homepage is currently very sparse, but that’s because the content it’s going to point to is not ready.

Categories
Blog post

Jenna supports WE on Giving Tuesday

Jenna Ortega posted this photo of herself in Kenya on her Instagram today. She visited there about a year ago to support the WE charity. Any donation made to that charity today, on Giving Tuesday, will apparently be amplified 10x, so please think about it (US/Canada only, it seems). See www.we.org/donate.