Over time, Lemmy instances are going to keep acquiring more and more data. Even in the best case, where they are not caching content and are only storing the data posted to communities local to the server, server storage requirements will still grow virtually without limit. Eventually, it may no longer be economically feasible to host the infrastructure needed to keep expanding a server’s storage. What happens at that point? Will servers begin to periodically purge old content? I am concerned that there will be a permanent horizon beyond which old, and still very useful, data will cease to exist. As Lemmy becomes more popular, the rate of growth in storage requirements will also increase, bringing that horizon closer. Is there any plan to archive this old data?
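
To make the "horizon" argument concrete, here is a minimal sketch of how a growing intake rate shortens the time until a fixed storage budget is exhausted. The function name, the geometric growth model, and every number are illustrative assumptions, not measurements from any Lemmy instance.

```python
def months_until_full(capacity_gb, first_month_gb, monthly_growth_factor,
                      max_months=1200):
    """Count the months until cumulative stored data exceeds capacity_gb.

    Hypothetical model: month 1 adds first_month_gb, and each later month
    adds monthly_growth_factor times what the previous month added.
    Returns None if the capacity is not exceeded within max_months.
    """
    if capacity_gb < 0 or first_month_gb < 0 or monthly_growth_factor < 0:
        raise ValueError("all inputs must be non-negative")

    stored = 0.0
    added = float(first_month_gb)
    for month in range(1, max_months + 1):
        stored += added
        if stored > capacity_gb:
            return month
        added *= monthly_growth_factor
    return None


if __name__ == "__main__":
    # Illustrative inputs only: a 100 GB budget and 10 GB stored in month 1.
    for factor in (1.0, 1.1, 2.0):
        print(factor, months_until_full(100, 10, factor))
```

With a constant intake the budget lasts longest; any growth factor above 1.0 pulls the horizon closer, which is the concern raised above.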

  • ubergeek77@lemmy.ubergeek77.chat
    1 year ago

    Hey, that’s a Vultr guide! I use Vultr, thanks!

    By the way, how are your costs on EC2? My understanding is that hosting on EC2 would be cost-prohibitive from data transfer costs alone, not to mention that their monthly rates for instances are pretty much always above the cost of a VPS.

    Now if only someone could do this for the Postgres data. I wonder if S3FS would be able to handle the load of a running database; that would be a nice way to save costs.

    • Toribor@corndog.uk
      1 year ago

      Currently I’m just running a single-user instance on a t2.micro. I’ve locked it up at least twice after subscribing to a big batch of external communities, so it’s definitely undersized if I were to open it up to more users. I only have one other small service running on that instance, though, so Lemmy is using the bulk of that capacity, at least when it’s got work to do.

      Costs are about $11.25 a month for the instance and about $2.50 for block storage (which is oversized now that pict-rs is on S3). I’m guessing that pict-rs S3 costs will be just a few pennies a day, probably less than a dollar a month, unless I start posting a lot on my own instance.

      Data transfer costs for me are zero though. I’m not using a load balancer or moving things between regions so I don’t expect that to change.

      • dudeami0@lemmy.dudeami.win
        1 year ago

        As for the data transfer costs, any network data originating from AWS that hits an external network (an end user or another region) typically will incur a charge. To quote their blog post:

        A general rule of thumb is that all traffic originating from the internet into AWS enters for free, but traffic exiting AWS is chargeable outside of the free tier—typically in the $0.08–$0.12 range per GB, though some response traffic egress can be free. The free tier provides 100GB of free data transfer out per month as of December 1, 2021.

        So you won’t be charged for incoming federated content, but serving content to the end user will count as traffic exiting AWS. I am not sure of your exact setup (AWS pricing is complex) but typically this is charged. This is probably negligible for a single-user instance, but I would be careful serving images from your instance to popular instances as this could incur unexpected costs.
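
As a rough illustration of the rule of thumb quoted above, the sketch below estimates a monthly egress bill from the 100 GB free tier and the $0.08 to $0.12 per GB range. The function name and the flat per-GB model are assumptions for illustration; real AWS pricing is tiered and varies by region and service, so treat the result as a ballpark only.

```python
def estimate_egress_cost(gb_out, free_gb=100.0, rate_low=0.08, rate_high=0.12):
    """Return a (low, high) estimate in USD for one month of outbound traffic.

    Assumptions taken from the quoted rule of thumb: the first free_gb of
    data transfer out is free, and the rest is billed at a flat rate
    somewhere between rate_low and rate_high per GB. Inbound traffic
    (for example, incoming federated content) is not counted at all.
    """
    if gb_out < 0:
        raise ValueError("gb_out must be non-negative")

    billable_gb = max(0.0, gb_out - free_gb)
    return billable_gb * rate_low, billable_gb * rate_high


if __name__ == "__main__":
    # Hypothetical monthly outbound volumes, chosen only for illustration.
    for gb in (20, 100, 500, 2000):
        low, high = estimate_egress_cost(gb)
        print(f"{gb:>5} GB out: ${low:.2f} to ${high:.2f}")
```

Under these assumptions, an instance serving less than 100 GB a month pays nothing for egress, which fits the single-user case, while image-heavy traffic to popular instances is where the bill starts to grow.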

      • ubergeek77@lemmy.ubergeek77.chat
        1 year ago

        Just FYI, you could save about $5 a month and get 2x the performance if you moved that to a VPS not on AWS. $11 a month for a t2.micro, especially if it’s locking up, basically means you’re being scammed, if I’m being honest 😅

        AWS isn’t really designed for long-running tasks like this unless you get a long-term commitment discount. It’s intended for enterprises and priced that way. For a hobbyist like you, I’d definitely recommend Vultr or something similar.

        Also, be careful about those bandwidth costs. It’s almost never free to serve data out like that. You may not be using a load balancer, but double-check those bandwidth costs; I remember paying for bandwidth I didn’t expect.

        Definitely consider moving to a $5 or $10 instance on Vultr; they have block storage too. You could either save money or spend the same for 3-4x the performance.