SteemYaLater: Your Steem Blog Image Backup Solution!

about 4 years ago (Edited)

Automation to the Rescue!

This Python 3.6 script uses the Steem Beem library and a variety of methods to archive your Steem blog images as well as markdown files. It uses image hash verification to ensure that each file is downloaded only once, saving valuable storage space.
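
To give a feel for the hash idea, here is a minimal sketch rather than the script's actual code. The `SeenImages` class and the choice of SHA-256 are my assumptions for illustration; the real script may hash differently.

```python
import hashlib


def sha256_of_bytes(data: bytes) -> str:
    """Return the hex SHA-256 digest of raw image bytes."""
    return hashlib.sha256(data).hexdigest()


class SeenImages:
    """Hypothetical in-memory index mapping a content hash to the first saved path."""

    def __init__(self):
        self._by_hash = {}

    def register(self, data: bytes, path: str):
        """Return (path_to_use, is_new).

        If identical bytes were registered before, the earlier path is
        returned and is_new is False, so the caller can skip writing a copy.
        """
        digest = sha256_of_bytes(data)
        existing = self._by_hash.get(digest)
        if existing is not None:
            return existing, False
        self._by_hash[digest] = path
        return path, True
```

A caller would check `is_new` before writing the image to disk and reuse the earlier path otherwise.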

Repository

https://github.com/anthonyadavisii/SteemYaLater

Version 2.0 Change Notes

Added PyCurl download method to address issues with steemitboard images
Data deduplication enabled: prevents re-download of a file if it already exists in the folder structure. A symbolic link with a relative path is placed instead, saving valuable storage space.
Logging and CSV output: Session log file is produced in working directory. Output CSVs are created for each account so users may readily see what failed and may require manual action.
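
The deduplication item above can be sketched roughly like this. It is an illustration under assumptions, not the code in the repo: `link_duplicate` is a hypothetical helper name, and it only shows how a relative symbolic link could stand in for a second copy of a file.

```python
import os


def link_duplicate(existing_path, duplicate_path):
    """Place a relative symlink at duplicate_path that points to existing_path.

    Returns the relative target stored in the link. If something already
    exists at duplicate_path, it is left untouched.
    """
    link_dir = os.path.dirname(os.path.abspath(duplicate_path))
    os.makedirs(link_dir, exist_ok=True)
    # A relative target keeps the link valid if the whole backup folder is moved.
    target = os.path.relpath(os.path.abspath(existing_path), start=link_dir)
    if not os.path.lexists(duplicate_path):
        os.symlink(target, duplicate_path)
    return target
```

Because the target is relative, the backup tree can be copied or mounted elsewhere without breaking the links, as long as the folder layout stays the same.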

Version 1.0

Version 1 was the basic framework with wget. We don't talk about version 1 anymore.

I've worked hard and made a ton of progress in order to give my fellow Steemians a way to save their priceless data.

Roadmap

Steem Blog Backup as a Service
@dtube thumbnail support
Upload to Skynet web portal

Known Issues

DTube thumbnails will not download because they are not stored within the Beem Comment json_metadata image property. Logic will be added to accommodate them. Also, some links may require escape characters. These will be addressed as time permits.
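
As a rough sketch of one way these issues could be approached (not what the script currently does): read the `image` list from json_metadata, fall back to scanning the post body for image links, and percent-escape awkward characters. The helper names and the regex are my assumptions, and I have not verified where DTube actually keeps its thumbnails, so the body scan may or may not catch them.

```python
import json
import re
from urllib.parse import quote

# Assumed pattern: http(s) links ending in a common image extension.
IMAGE_URL_RE = re.compile(
    r"https?://[^\s\"'<>)\]]+\.(?:png|jpe?g|gif|webp)", re.IGNORECASE
)


def collect_image_urls(json_metadata, body):
    """Return image URLs from json_metadata['image'], then any extra ones found in body."""
    if isinstance(json_metadata, str):
        try:
            json_metadata = json.loads(json_metadata or "{}")
        except ValueError:
            json_metadata = {}
    if not isinstance(json_metadata, dict):
        json_metadata = {}
    images = json_metadata.get("image") or []
    if isinstance(images, str):
        images = [images]
    urls = list(images)
    for match in IMAGE_URL_RE.findall(body or ""):
        if match not in urls:
            urls.append(match)
    return urls


def escape_url(url):
    """Percent-escape characters such as spaces while keeping URL delimiters intact."""
    return quote(url, safe=":/?&=#%+@,;~")
```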

Uses Python 3.6

Install Prerequisites

# PyCurl may require the following packages be installed.

sudo apt install libcurl4-openssl-dev libssl-dev

# Python modules installation

python3.6 -m pip install beem
python3.6 -m pip install wget
python3.6 -m pip install urllib3[secure]
python3.6 -m pip install pycurl
python3.6 -m pip install certifi # may or may not be needed if the [secure] option is used for urllib3

Execute Script

python3.6 SteemYaLater.py

Prompts for a Steem user. Alternatively, you may populate the accounts list variable with the users to back up.

Account to Backup? anthonyadavisii

Script will crawl your blog_entries filtering out resteems (reblogs)
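
The filtering step could look something like this minimal sketch. It assumes each blog entry is a dict with `author` and `permlink` keys and that a resteem is simply an entry whose author differs from the blog owner; the field names and the `own_entries` helper are assumptions for illustration, not taken from the script.

```python
def own_entries(entries, account):
    """Keep only entries authored by the blog owner, dropping resteems (reblogs)."""
    return [entry for entry in entries if entry.get("author") == account]


# Example with hand-written entries shaped like the assumed blog_entries data.
sample = [
    {"author": "anthonyadavisii", "permlink": "my-own-post"},
    {"author": "someoneelse", "permlink": "a-resteemed-post"},
]
mine = own_entries(sample, "anthonyadavisii")
```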

It will then cycle through each blog_entry, save the body to a .txt file, and grab any images it can with wget or urllib3
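
Here is a rough sketch of that loop body, under stated assumptions. The folder layout (`backup_root/account/permlink`) and both helper names are hypothetical, and the downloaders are passed in as plain callables so the same fallback logic can wrap wget, urllib3, or PyCurl without this sketch guessing at their exact calls.

```python
import os


def save_body(backup_root, account, permlink, body):
    """Write a post body to <backup_root>/<account>/<permlink>/<permlink>.txt."""
    post_dir = os.path.join(backup_root, account, permlink)
    # exist_ok avoids a FileNotFoundError when the backup folder is missing.
    os.makedirs(post_dir, exist_ok=True)
    path = os.path.join(post_dir, permlink + ".txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(body)
    return path


def download_with_fallback(url, dest, downloaders):
    """Try each (name, callable) pair in order; return the name that succeeded.

    Each callable takes (url, dest) and raises an exception on failure.
    """
    errors = []
    for name, download in downloaders:
        try:
            download(url, dest)
            return name
        except Exception as exc:  # any failure means: try the next method
            errors.append("{}: {}".format(name, exc))
    raise RuntimeError(
        "all download methods failed for {}: {}".format(url, "; ".join(errors))
    )
```

Returning the name of the method that worked makes it easy to record in a log or CSV which downloader handled each image.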

Feel free to reach out if you need help! If you appreciate the work, consider sending me a tip!

This post was created using the @eSteem Desktop Surfer App.

They also have a referral program that promotes users to onboard to our great chain. Sign up using my referral link to help support my efforts to improve the Steem blockchain.

hive-136515 steemyalater steemdev backups blog python stemgeeks

27 comments

@luegenbaron 65

about 4 years ago (Edited)

Finally!

Thanks Bro!

@tipu curate

@anthonyadavisii 73

about 4 years ago

NP. Just added another check on image hosts for DNS lookups. Ideally, I'll store unresolved hosts in memory and skip them if they were not resolved previously. Due to the pyCurl settings, the script tends to get hung up for a while on timeouts. It works but definitely more optimization needed.
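
A minimal sketch of that "remember unresolved hosts" idea, assuming a simple in-memory set is enough. `HostChecker` is a hypothetical name, and the resolver is injectable so the logic can be tried without touching the network.

```python
import socket
from urllib.parse import urlparse


class HostChecker:
    """Remembers hosts that failed DNS lookup so later URLs on them are skipped."""

    def __init__(self, resolver=socket.gethostbyname):
        self._resolver = resolver
        self._unresolved = set()

    def can_resolve(self, url):
        """Return True if the URL's host resolves; failures are cached in memory."""
        host = urlparse(url).hostname
        if not host or host in self._unresolved:
            return False
        try:
            self._resolver(host)
            return True
        except OSError:  # socket.gaierror is a subclass of OSError
            self._unresolved.add(host)
            return False
```

Checking `can_resolve(url)` before handing a URL to the downloader would avoid waiting on a timeout for every image on a dead host.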

Appreciate the reblog. Hoping to have a full-fledged service up soon but need to work out my storage solution atm.

@steemitboard 66

about 4 years ago

Congratulations @anthonyadavisii! You have completed the following achievement on the Steem blockchain and have been rewarded with new badge(s):

	You published more than 450 posts. Your next target is to reach 500 posts.

You can view your badges on your Steem Board and compare to others on the Steem Ranking
If you no longer want to receive notifications, reply to this comment with the word STOP

To support your work, I also upvoted your post!

Do not miss the last post from @steemitboard:

	Downvote challenge - Add up to 3 funny badges to your board
	Use your witness votes and get the Community Badge

Vote for @Steemitboard as a witness to get one more award and increased upvotes!

@freebornangel 72

about 4 years ago

Can you set it up to email pdf's?
Or provide a link to download a PDF?

@anthonyadavisii 73

about 4 years ago

What I'm doing currently is zipping up the backup on my Ubuntu box, transferring that to my cloud provider, and then providing a shareable link.

It's a few too many manual steps at the moment, but I plan to automate some of them, such as having the script take care of compression. Your update should be next up, btw.
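
For the compression step, something along these lines might do. It is a sketch assuming backups live in one folder per account under a common root; `zip_backup` and the `_backup` suffix are hypothetical names.

```python
import os
import shutil


def zip_backup(backup_root, account, out_dir=None):
    """Zip <backup_root>/<account> and return the path of the created archive."""
    source = os.path.join(backup_root, account)
    if not os.path.isdir(source):
        raise FileNotFoundError(source)
    out_dir = out_dir or backup_root
    base_name = os.path.join(out_dir, account + "_backup")
    # root_dir/base_dir keep paths inside the archive relative to the account folder.
    return shutil.make_archive(base_name, "zip", root_dir=backup_root, base_dir=account)
```

One caveat: as far as I know, a zip built this way stores the contents of symlinked files rather than the links, so deduplicated images may take up their full size again inside the archive.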

@freebornangel 72

about 4 years ago

Sweet, I would hate to be wiped from the only history likely to be written by me.

@abitcoinskeptic 75

about 4 years ago

I'm not going to try this in its current form due to an aversion to the technical nature, but I do like the road map.

Your efforts are greatly appreciated.

@anthonyadavisii 73

about 4 years ago

I'll look into putting all the dependencies into a Docker container to simplify use and so users on non-Ubuntu operating systems can use it. Thanks for the sentiment. Glad to help!

@abitcoinskeptic 75

about 4 years ago

I'm running linux minty on a netbook I rarely use...would that manage this?

@battleaxe 70

about 4 years ago

Amazing work, thank you

@drakos 70

about 4 years ago (Edited)

There are errors in your install prerequisites
Python 3.6 should be python3.6

Also, I had a problem installing pycurl, so do this first:
sudo apt install libcurl4-openssl-dev libssl-dev
then
python3.6 -m pip install pycurl

Another problem when I run it:
FileNotFoundError: [Errno 2] No such file or directory: '/home/drak/SteemYaLater/Backups/drakos'

So I created that folder manually, then I run the tool again, it fetches the posts but when it starts downloading them, it spits out binary characters on the screen. Did you test this tool properly?

@anthonyadavisii 73

about 4 years ago

I forgot about that. Yes, that was also encountered during my setup but it slipped my mind. I'll update the readme. Thanks, @drakos!

@reverendrum 62

about 4 years ago

Terminal is spitting out garbage but it's backing up some of my posts at least after I made the backup folder.

@anthonyadavisii 73

about 4 years ago

Good catch! Updated repo

https://github.com/anthonyadavisii/SteemYaLater/commit/172a3883a29b40fed8b70d5124e6293e66084439

@reverendrum 62

about 4 years ago

My man! For those who didn't know, he's been working diligently on this for a while now. He deserves more than the 125 upvotes I'm seeing right now.

@anthonyadavisii 73

about 4 years ago

Updated again to inject header data to steemitimages.com. Recommend increasing pauseTimeInit due to suspected rate limiting.
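
To illustrate both ideas, here is a hedged sketch using only the standard library. The header values are placeholders I made up, not the ones the script sends, and I am assuming `pauseTimeInit` means an initial pause in seconds that grows after each retry; the function names are hypothetical.

```python
import urllib.request

# Placeholder headers for illustration only.
DEFAULT_HEADERS = {
    "User-Agent": "Mozilla/5.0 (compatible; SteemYaLater backup)",
    "Referer": "https://steemit.com/",
}


def build_request(url, headers=None):
    """Build a request object carrying the default headers plus any overrides."""
    merged = dict(DEFAULT_HEADERS)
    merged.update(headers or {})
    return urllib.request.Request(url, headers=merged)


def pause_schedule(pause_time_init, retries, factor=2.0, max_pause=60.0):
    """Return the pause (in seconds) to wait before each retry, growing by factor."""
    pauses = []
    pause = float(pause_time_init)
    for _ in range(retries):
        pauses.append(min(pause, max_pause))
        pause *= factor
    return pauses
```

With a schedule like this, raising the initial pause stretches every later wait too, which is the kind of slowdown a rate-limited host tends to need.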

@mathowl 66

about 4 years ago

@dexterdev

@dexterdev 63

about 4 years ago

Thank you @mathowl

@scienceblocks see this. maybe of help to you

@bobinson 67

about 4 years ago

Really appreciate coming up with this! Going to test now.

@anthonyadavisii 73

about 4 years ago

May want to get the latest as I made a few tweaks and addressed a couple issues.

@bobinson 67

about 4 years ago

I ran into some delimiter issues/errors while trying to run it. I have raised an issue. Will check whether I can fix it (I'm stuck with this bloody Bitcoin and ZMQ crash at work!)

@gorans 38

about 4 years ago

Cool

@engrave 73

about 4 years ago

Unfortunately, the newest version from Github does not work. Any chances to fix that?

  File "SteemYaLater.py", line 187
    continue
    ^
SyntaxError: 'continue' not properly in loop

@anthonyadavisii 73

about 4 years ago

Thanks for the heads up. Will take a look.

@engrave 73

about 4 years ago

Let me know when it's ready, I would love to contribute by dockerizing it.

@anthonyadavisii 73

about 4 years ago

Removed those continue statements. Should be good to go now. Putting it in docker would be really helpful. Thanks!
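
For anyone curious about that error: Python only accepts `continue` directly inside a `for` or `while` loop. The sketch below is a generic illustration, not the code that was on line 187; the function names are made up.

```python
def keep_http_urls(urls):
    """Skip non-HTTP links; continue is legal here because it sits inside the for loop."""
    kept = []
    for url in urls:
        if not url.startswith("http"):
            continue
        kept.append(url)
    return kept


def should_skip(url):
    """In a helper called from the loop, return a flag instead of using continue."""
    return not url.startswith("http")
```

If skip logic is moved out of the loop into a helper, the helper has to return a value that the loop then acts on.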

@engrave 73

about 4 years ago

Thanks, I will do a PR tomorrow.