An open question, regarding a HUGE coding idea

By Oliver (AKA the Admin), in Categories: Just Talking

Hello!

This question is for coders. Crazy coders. If that's not you, no problem, feel free to skip to any other post you like ^^

I have seen something impressive on a private French forum dedicated to ebooks (maybe “CDL” rings a bell; otherwise, I'm sorry, but that's as far as I feel allowed to tell, the rest is a secret that doesn't belong to me).
On that forum, each post has a field containing a Zippyshare download link that is perpetually kept alive.

A script checks whether the download link is dead; if it is, it re-uploads the file and updates the HTML in the post with a new working download link.

I would really, REALLY like to have a feature like that on Hentairules.

Can you imagine? It would mean that even posts that are seven years old, whose zip files have died from inactivity or because their hosts shut down, would have a chance to have their links brought back to life. Automatically, without requiring human time or intervention. How convenient that would be!!

–> Do you think a system like that could be adapted to Hentairules?

I'm providing more thoughts and information below:

– Since I self-host on my personal dedicated Debian server, scripting isn't an issue; I can install new languages if needed (just don't ask for stuff like .asp or languages requiring a paid licence, if that even exists, OK ^^).

– I have a private, web-reachable storage location for (almost) all my zips, from which the script could pull the source files. From there, either the script handles uploading the file to the hosts itself (copy it to ~/tmp and upload it like a visitor would), or it triggers a “remote upload” (all file hosts offer that option now: it means telling a file host “please fetch that file from the given location, wget it and place it on your servers, thanks”). In either case, that implies behaving like a browser in front of the file host's server, waiting a few minutes, and coming back to check the list of files in the account.
I know that sounds unrealistic, but the thing is, I've seen a place where it works O_o
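For the link-checking half, here is a minimal sketch. One wrinkle: many hosts return a normal 200 page saying the file is gone, so besides HTTP errors the script also needs per-host "dead file" marker strings. The markers below are made-up guesses, not real host messages:

```python
# Minimal dead-link probe (a sketch). Real hosts often answer 200 with an
# HTML "file deleted" page, so we also scan the body for marker strings.
import urllib.request
import urllib.error

# Hypothetical per-host markers; the real strings must be collected by
# looking at each host's actual "file gone" page.
DEAD_MARKERS = ("File does not exist", "File has been deleted")

def link_is_dead(url, html=None):
    """Return True if the link looks dead. `html` lets callers pass a
    pre-fetched page body (and lets tests avoid the network)."""
    if html is None:
        try:
            with urllib.request.urlopen(url, timeout=15) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except urllib.error.HTTPError as e:
            return e.code in (404, 410)
        except urllib.error.URLError:
            return True  # host unreachable: flag it and re-check later
    return any(marker in html for marker in DEAD_MARKERS)
```

In a cron job, this would be called once per stored link; anything flagged dead goes into the re-upload queue.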

– I would appreciate it if the script could upload and fetch the file on one of my usual hosts with my login:password credentials (and, in the worst case, zippyshare: the only working example I ever saw used it ^^)

– As for updating the posts with the new zip links, I'm in the fog. The question is: how does the script know which blog post to update, and how can it bring that update to precisely that blog post, instead of to another one? A blog plugin could insert HTML snippets into posts, sure… Or I could hard-code a PHP/HTML call, in the page displaying the blog posts, to something the script writes. No idea.
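One way around the "which post do I update" problem is to never store raw links in the post HTML at all: posts carry a stable token, and the current link is substituted at render time, so updating a link never touches post HTML. A minimal sketch, with a made-up token format and example data:

```python
import re

# Hypothetical scheme: posts contain a stable token like [zip:berserk-v1]
# instead of a hard-coded URL. Rendering substitutes the current link, so
# "updating a post" is just updating one mapping entry.
LINKS = {  # illustrative data; in practice this would live in a database
    "berserk-v1": "http://www45.zippyshare.com/v/abc123/file.html",
}

def render(post_html):
    """Replace every [zip:ID] token with the currently known link."""
    return re.sub(r"\[zip:([\w-]+)\]",
                  lambda m: LINKS.get(m.group(1), "#dead-link"),
                  post_html)
```

A WordPress shortcode could play the same role as the token here; the point is only that the post references a key, not a URL.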

– There is also the case of blog posts offering more than one zip file. The easiest solution is to only care about the first zip file offered in the post, and disregard any that follow.

Honestly, I think this is *too* complex to write.
Or, more accurately: absolutely, inescapably, totally far too complex :D
Each part would be doable separately, with skill and lots of time, but making it all work together, and finding how to match posts with their new zip links, whoah.
But, hey, maybe I’m wrong, and one of you guys could do it ! :D
I risk nothing in asking, don’t I ;)

So there we are; I wrote this post to ask.
And not just for my blog: imagine if a script like that allowed file sharers to keep their links alive. How cool would that be!
Thank you VERY MUCH if you've got something to say about it ;)

And no hard feelings if you fear you'll crush my hopes; even I don't believe it's doable ^^

19 Comments
FSK
9 years ago

Nope, I briefly looked into it. Depositfiles does not appear to have an API that supports automatic uploading of files.

Actually, 1TB of storage and the corresponding bandwidth is cheap nowadays. Why not put the files on your own server, and find a partner to serve the ads?

Oliver AKA The Admin
9 years ago
Reply to  FSK

Because I would have to host a horribly complex file-hosting script, which would mean having to manage even more, more, MOAR gigatons of things.

While, on the contrary, my biggest problem is that I don't have enough time on my hands.

And with everyone downloading from me, I'd need another server, thus more money spent. Frankly, that'd just be too complicated T__T

oldbrokenhands
9 years ago

Oliver, forgive me for my brutal honesty, but I feel it's time for you to hear it.

First of all, you are too big now to be using upload sites. Yes, it's convenient when you have a few posts on a forum and it keeps you from having to host a website, but as you're seeing, it's a nightmare when you have gigs of data and almost a thousand posts.

Secondly, you wouldn't have enough time on your hands if you were doing this alone, but as you've posted in the past, you can probably rustle up a few good technical assistants to relieve you of some of your burden.

From what I've been seeing, a lot of decent cloud services are opening up these days, with fairly simple code to learn for hosting your own cloud-storage webpage.

Now I understand if you're wary of having some of these cloud services bar or stop you from hosting adult content on their servers, or of losing control of what content is hosted on the site, but at least it's worth a long hard look and consideration.

I guess the summary of what I'm saying is, cloud hosting may not be as hard as it seems and you should not have to do this alone.

ayle
9 years ago

As FSK said, your problem is not the script, it's the APIs. I think if there hadn't been such a crackdown by the MAFIAA, depositfiles and the like would have released APIs that allow automated file uploading, but with the piracy situation being what it is…. Yeah….

hub0083
9 years ago

As others have said, finding a place to host your files is the real issue. One poster said Depositfiles doesn’t have a public API, and if so, you’re kind of stuck. Either you try to pick apart the manual upload process and see if you can automate it, or you have to go w/ a different hosting service.

As for the script itself, it sounds like you want a script tied to a cron job that checks whether the last known good link for a file is dead, and if so, re-uploads it and then updates the appropriate post. I can help you w/ the scripting portion if you’d like – I’m mostly into Python nowadays but I’m good w/ PHP. Don’t even want to attempt this w/ bash…
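That cron-driven loop could be sketched like this. The table layout, column names, and the two stub callbacks are all illustrative; the real `is_dead` and `reupload` would be host-specific, as other comments point out:

```python
# Sketch of the cron-driven refresh loop: walk a table of known links,
# and for each dead one re-upload the local copy and record the new URL.
# Schema and names are illustrative, not from any real installation.
import sqlite3

def refresh_links(db_path, is_dead, reupload):
    """is_dead(url) -> bool; reupload(local_path) -> new_url."""
    con = sqlite3.connect(db_path)
    rows = con.execute("SELECT id, url, local_path FROM links").fetchall()
    for link_id, url, local_path in rows:
        if is_dead(url):
            new_url = reupload(local_path)
            con.execute("UPDATE links SET url = ? WHERE id = ?",
                        (new_url, link_id))
    con.commit()
    con.close()
```

A weekly crontab entry such as `0 4 * * 0 python3 refresh_links.py` would match the "test all links weekly" idea mentioned in another comment.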

Yv@n
9 years ago

While having a public API would be handy, it's not really necessary, by the looks of it. Logged-in users can upload files without solving a captcha, so it's only a matter of creating a simple HTTP client that posts some form data to dfiles and parses the result.

The hard part would be for the script to identify which of your stored files matches each dead link it finds in the posts; also, if you want this to work with every upload service, things can get quite hideous really fast.
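A sketch of that HTTP-client idea: build the logged-in form POST, then scrape the host's response page for the new link. The endpoint, field names, cookie, and markup are all hypothetical; the real ones have to be read out of the host's upload page with browser dev tools, and an actual file upload would need a multipart/form-data body rather than the urlencoded one shown here:

```python
import re
import urllib.parse

def build_form_post(endpoint, session_cookie, fields):
    """Return (url, headers, body) for a urlencoded form POST made
    with an existing logged-in session cookie."""
    body = urllib.parse.urlencode(fields).encode()
    headers = {"Cookie": session_cookie,
               "Content-Type": "application/x-www-form-urlencoded"}
    return endpoint, headers, body

def extract_download_link(result_html):
    """Pull the first http(s) link out of the host's response page."""
    m = re.search(r'href="(https?://[^"]+)"', result_html)
    return m.group(1) if m else None
```

The fragile part is exactly what Yv@n notes: each host needs its own endpoint, field names, and result-page parsing, so this grows hideous fast.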

Jungy
9 years ago
Reply to  Yv@n

Seems to me storing the info in a flat structure like a hash would work:

{ /path/to/file -> http://upload_link,
…}

When creating new posts, you would reference the key, and the server would use that value to provide the link dynamically. That way, as the links are updated, the website updates automatically too, without needing to comb each post for dead links; you only check each link.

Then you could have a separate cron job iterate through the links and update them as needed.

Only downside is my suggestion is rather impractical. :/

Good luck!

Gamon
9 years ago

Oliver, all you need is to contract (face to face) a really, really young tech nerd to help with all the weird tech stuff: making your site, downloads, pics etc. better… the internet "upgrades" so fast these days :D lol

bob
9 years ago

I kinda wanted to do something like this; I'd love to give you a hand.

For link checking, do you want it to be user-submitted (e.g. a user finds a dead link and clicks a report button, which auto-reuploads the file; this would be the least taxing way to do what you want), or do you want to test all link statuses weekly and auto-reupload dead links?

Also, after a quick look I found a service that can help you with re-uploading to file hosts, and it has an API.
This may be of interest to you:
http://reupload.it/ http://reupload.it/faq/

Fuura
9 years ago

Hey, starting in March I have an internship to do for my IT studies, and I still haven't found a place to go. Would you hire me? :D

Huhu, just kidding. Actually, I don't know much about file hosting and all that stuff, and I don't know which language would be most appropriate. :p

Anyway, hope you'll find something !

(That said, I'm really looking for an internship, and I'm seriously in the shit if I can't find one… I won't graduate and get my degree, and my year will be lost haha… If you live in France, Oliver, you must know that on top of that, with the required pay, it's a real struggle to find an internship, especially when you're doing a professional bachelor's ^^) :)

Oliver AKA The Admin
9 years ago
Reply to  Fuura

Ah, yes, what a struggle. Unless you accept working for free :(

Fuura
9 years ago

Well, honestly, finding an internship is such a pain that by the end you'd almost be ready to accept not being paid… but anyway, we don't have a choice: it's a 4-month internship, so legally we must be paid. Companies can grumble all they want; if they want an intern, they normally can't get out of it.

But still, I'm sorry, I don't find it normal that companies refuse to pay interns; it's 4 months after all, that's not nothing. We do actual work, we're not there to sightsee. Maybe the required pay is too high, but at the same time, if we let it slide, we'd either get nothing or a €100 token payment… I may sound like I'm ranting a bit, but come on, let's be serious… well, it's the eternal debate anyway! Meanwhile, we're the ones left out in the cold, and then people tell us that in some fields (including IT, apparently), there's supposedly no shortage of offers. :D

JamieWolf
9 years ago

Hey ya Oliver, everything is possible even without an API, but then you should consider restructuring a little more than just adding a script: WordPress post types etc., which would help automate things. I own and manage a couple of servers myself, and if I were you I would stay with the one-click hosters. Bandwidth may be cheap, but I guess your site gets quite a few hits and you would run out of bandwidth soonish (at least in the first months, when everyone downloads like a little monkey).

Otherwise, if you really plan on doing some coding, go for GitHub as a development platform and get other people involved. I guess a few talented code monkeys are reading this post, and a few hours of hacking on some code is something they'd be willing to do, just not a whole project alone, which is too much.

Daemoniak
9 years ago

Hi Oliver,

Interesting question. I know zip about file hosting, however I do code for a living, so I can answer part of it.

How to keep the posts up-to-date? The simplest way is to have a database of "(file, hosting site) -> url" and to dynamically retrieve the links from that database. Actually, it could even be "file -> list of (hosting site, url)", with the list of files built automatically. Note that the database itself does not know which blog posts talk about a file; it is the other way around: the posts know which files they have.
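That "file -> list of (hosting site, url)" layout could look like this in sqlite. Table and column names are illustrative only; the point is that posts reference files, and files carry their current mirrors:

```python
import sqlite3

# Illustrative schema: one row per local file, any number of mirror rows
# pointing at it. Posts would reference files.id, never a raw URL.
SCHEMA = """
CREATE TABLE files   (id INTEGER PRIMARY KEY, local_path TEXT UNIQUE);
CREATE TABLE mirrors (file_id INTEGER REFERENCES files(id),
                      host TEXT, url TEXT);
"""

def mirrors_for(con, local_path):
    """All (host, url) pairs currently known for one file."""
    return con.execute(
        "SELECT host, url FROM mirrors "
        "JOIN files ON files.id = mirrors.file_id "
        "WHERE files.local_path = ?", (local_path,)).fetchall()
```

Replacing a dead link then means deleting or updating one `mirrors` row; every post rendering from the database picks up the change for free.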

How to check that a url is not dead? Well, indeed, a simple cron job could iterate over your database table and check each url in turn… provided that you actually manage to automate the check (each hosting site will have a different way of doing it). Note that the automation can be tricky; it might be far simpler to just have people notify you…

How to re-upload? Once again, it really depends on the API offered by the various websites you wish to re-upload to…

Have you thought about a semi-automated system?

If everything cannot be automated, you need not abandon hope: automate what you can!

The first thing to automate is the generation of links in your blog posts, for everything starts with a table of files to urls. This can be done on your website, whatever language you use. You will also need a script to convert the existing blog posts to the new format (and recover the files/urls used in them).

Automating the check is the obvious second candidate, but it is not strictly necessary. Instead, you could simply integrate a "report dead upload" button into the link generation and save the reported links in a database table (or use the same table and mark them as reported dead). If you are capable of automating the check, that is easier; otherwise, a human being (an admin) can be involved to actually verify the link. You might need countermeasures if people start (ab)using the report functionality to mark all links as dead; a grace period (last verified 2015-01-03 => next check 2015-01-13) is the easiest, though not the only one (if the check is automated, it becomes less necessary).
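That grace-period rule is tiny to implement. Here it is as a helper, with dates as ISO strings matching the example above and the 10-day window taken from it:

```python
from datetime import date, timedelta

def needs_recheck(last_verified_iso, today=None, grace_days=10):
    """A reported link is only re-checked once its grace period has
    passed; reports inside the window are ignored."""
    today = today or date.today()
    return (date.fromisoformat(last_verified_iso)
            + timedelta(days=grace_days)) <= today
```

Stored next to each link's `last verified` date, this is all the abuse countermeasure the paragraph describes.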

Similarly, while re-uploading automatically sounds neat, it can be delegated to a human being (an admin). Once said admin has realized that a link is dead, he can be shown a simple web page giving the url where the file is stored on your side and a link to the upload form of the website to upload to; once uploaded, he can copy/paste the new url into this page to automatically update the database (and all future versions of the blog post).

If you have any questions, feel free to reach out to me by e-mail; I really appreciate your website and would gladly help in return.

oldbrokenhands
9 years ago

Okay so looking at the posts, it's not an impossible coding project, just one that would require multiple methods and a few classes if you wrote it in something like Java.

Pseudocode would look something like this:

for url in urlList:
    if not url.status:   # link is dead
        reupload(url.file)
        update(url.post)

-urlList would be an array of url objects with the following fields:
-status (true while the link is still alive)
-file (the filename associated with the url upload)
-post (the HR post that was generated earlier)

As Daemoniak and others have stated, the best way to build this object would be as a database, using queries on the object's fields to guide the automated script.

It's not impossible, but it has separate parts. It would depend on building methods for accessing each of the fields in the object and changing/updating the database accordingly.

In other words divide and conquer the tasks, have two or three folks working on it and you have a viable automated program for maintaining your posts.

bakabaka
9 years ago

Use plowshare to automate uploads to download sites. It can also check whether links are alive. It supports a few dozen sites and is regularly updated. You can store your account credentials in a text file for fully automated uploads.

You still need to write a script that parses your DB, extracts links and checks/renews them, but that should be easy with basic programming knowledge. Unfortunately, plowshare is coded in bash, with no API, so you’ll need system calls to use it.
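Those system calls could be wrapped like this. The `plowprobe` (link status) and `plowup` (upload) command names come from plowshare's documentation, but the exact flags should be verified against the installed version; the `run` helper only executes when the binary actually exists:

```python
import shutil
import subprocess

def probe_cmd(url):
    """Command line to check one link's status with plowprobe."""
    return ["plowprobe", url]

def upload_cmd(host_module, path):
    """Command line to upload a local file to one host with plowup."""
    return ["plowup", host_module, path]

def run(cmd):
    """Execute a plowshare command and return its stdout."""
    if shutil.which(cmd[0]) is None:
        raise RuntimeError(cmd[0] + " is not installed")
    return subprocess.run(cmd, capture_output=True, text=True).stdout
```

Parsing plowshare's stdout (it prints the resulting URL on success) is then the glue between the DB script and the hosts.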

ChileanGuy
9 years ago

It seems there's an API available at Dfiles:
http://stackoverflow.com/questions/23472812/have-

SomeDev
9 years ago

Mega has an API, officially documented only in C++: https://mega.co.nz/#dev
However, it seems that it also accepts HTTP/JSON requests, as this guy posted full examples in Python and PHP, even anonymous uploads etc.: http://julien-marchand.fr/blog/

Joda
9 years ago

I am pretty certain I could do this.

What I see as a potential source of unreliability is finding the exact files in your archive from the post title alone.

Also, I can't seem to find a bulk link checker on the depositfiles page, so link-validity checking would have to be done one by one (still easy to automate tho).

Depositfiles uploading should be possible to automate easily, especially since they even offer FTP upload, and otherwise, if that's problematic, I could also just use their normal webform or whatever they use.

Getting and editing posts should be possible directly in the database, and thus poses no problem.

The language of my choice is python2. If you are interested, leave me a message on IRC, EFnet, #xdxdxd.