I do not understand why your files would disappear. But, it might be that you were loading files that had strange code in them that was causing other issues.
As you can see from how little code we now have working, you can do a lot if you just think out the logic ahead of time. We only use about a dozen lines of code to do this entire process. I have code that scrapes things like NFL scores and NFL schedules off of sites for my football pool. I have pulled weather reports out for others who needed just a certain area. Scraping is hard sometimes, but for something like this, it is easy enough… Let us know the results of your testing… We can help if it does not work!
In all my years working with computers, I have never seen a case where you save a file to your hdd, it has a file size, and then later when you open it the file disappears, or the page opens but when you quit out of your browser the .html file disappears.
I added an “Article name” field to this page, so that when your script runs, the file is already named after the article, which saves me a few keystrokes.
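For anyone reading along, here is a rough sketch of how an “Article name” value could be turned into a safe filename. It is in Python rather than the script used in this thread, and the function name `article_filename`, the `.html` default, and the `untitled` fallback are all made up for illustration, so treat it as one possible approach only.

```python
import re


def article_filename(article_name: str, extension: str = ".html") -> str:
    """Turn a free-text article name into a filesystem-friendly filename.

    Hypothetical helper: the name, the default extension, and the
    "untitled" fallback are assumptions, not part of the original script.
    """
    # Collapse every run of characters other than letters, digits,
    # dot, underscore, or hyphen into a single hyphen.
    slug = re.sub(r"[^A-Za-z0-9._-]+", "-", article_name.strip())
    # Drop leading/trailing hyphens and dots so the name is never hidden or empty-looking.
    slug = slug.strip("-.")
    if not slug:
        slug = "untitled"
    return slug + extension
```

Calling `article_filename("My Article: Part 1/2")` gives `My-Article-Part-1-2.html`, so characters like `:` and `/` that cause trouble in filenames never reach the disk.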
I’m sure there is more that I can do to pretty things up, but for now I think this solution will do.
Now I have about 200 web pages I have to resave…
But THANK YOU for all of your help, and for saving me lots of wasted time!! (I was supposed to spend today getting my head back into a coding project from a few years ago, but I needed to get this fixed first because every day I save 5-10 articles and I am soooo backed up right now!!)
If you are willing, I definitely could use some more help as I get back into web development.
Never thought I’d see the day where I would even forget HTML coding, but my mind is getting old and brittle!!!
Hopefully I can get back to where I was 3-4 years ago, when I think I was pretty good at things.
You know, I was thinking about the files disappearing… It could be because they were saved as temporary files.
Those are erased after use. But, I would have to see the code you used for it. But, kind of not important now, I guess…
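To show what I mean by a temporary file being erased after use, here is a small Python sketch. It only demonstrates the general behavior with Python's standard `tempfile` module; it is not a diagnosis of what the browser did, and the function name `temp_file_lifetime` is just something I made up for the example.

```python
import os
import tempfile
from typing import Tuple


def temp_file_lifetime() -> Tuple[bool, bool]:
    """Report whether a temporary file exists while open and after it is closed.

    Hypothetical demo function, not code from the original script.
    """
    # NamedTemporaryFile deletes the file on close by default.
    with tempfile.NamedTemporaryFile(suffix=".html") as tmp:
        tmp.write(b"<html><body>saved page</body></html>")
        tmp.flush()
        path = tmp.name
        existed_while_open = os.path.exists(path)
    # Leaving the "with" block closes the file, which removes it from disk.
    exists_after_close = os.path.exists(path)
    return existed_while_open, exists_after_close
```

The file is really on disk, with a size, while it is in use, and then it is gone as soon as the program is done with it. That is the same kind of symptom as a file that vanishes after you open it.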
No problem! Glad to help you! And, there are several others on this site that are aces, a few much better than me. So, when you get your next issue, create a new post and one of us will help you. I am here a lot as I just love helping everyone with their programming puzzles…
[quote=“ErnieAlex, post:23, topic:27724, full:true”]
You know, I was thinking about the files disappearing… It could be because they were saved as temporary files. Those are erased after use. But, I would have to see the code you used for it. But, kind of not important now, I guess…[/quote]
I went into Firefox and surfed to a web page, decided to save it, chose File > Save Page As, gave it a name, and saved it to my computer. Then I went to Finder, and the file was there with a file size. There was a my-article-name.html file and a my-article-name_files folder. I would double-click on the .html file and either it would instantly disappear, or the page would open okay, but once I quit out of Firefox the .html would disappear before my eyes.
You know, I am listening to some good rock-n-roll now late at night, working away on my code, and forgot how much I missed doing this stuff on late Saturday nights!
It will take me some time to get my head back into LAMP, but the good thing is that I have a kick-ass code base I wrote several years ago, plus I document the s*** out of my code, so that means I have a super-duper guide to getting my head back into LAMP and my website!!
At the same time, my brain is getting old and brittle, and it would be nice to have some coding buddies online who are happy to help. (Sadly, a lot of people on these online coding forums can get rather nasty if you ask for help, but clearly you are of a different mold, and that is good to know!)
If this is only for reading later I strongly suggest you just use one of the tools available for it. They do a much better job at cleaning up the source for later reading, and there are apps and browser extensions for them so you can just click one button to save pages for later.
Raindrop is a very nice service for this
Or if you want to host one yourself I’d suggest giving wallabag a go
Thanks for the suggestion, but here is the issue… I am trying to get accurate local copies so I can use them for future research. Not saying you, but people who think you simply bookmark what you like and it will be there in the future are naive.
The Internet has become all about $$$, and companies are not motivated to leave knowledge out there without paying a price.
I see information come and go on a daily basis, so I do what I can to save the original so it is there when I need it in a week, month or years later.
Until recently, a simple File > Save Page As worked, but for some sites like the New York Times, forget about it!!
Using a combination of different browsers, different approaches, saving as HTML, web archives, PDF and screenshots, I think I have a good enough approach for my needs, but it certainly is much harder than in the past.
And that is too bad, because I also try to save copies of things for “inspiration” - the New York Times for some gorgeous news articles online, but unfortunately they are a real pain to store locally without an enormous amount of work!
Wallabag and the other “read later” apps I’ve tried do save a local copy. If you run Wallabag locally (I have an unraid server I run lots of dockerized apps on) you get a local database of all the content which you can back up.
Wallabag does NOT give you the entire source though by default. So it might not fit your use case. But I’d expect some of the “read later” apps to.
It’s not that hard for me to find browser plug-ins that strip out ads and just give you readable text. But as mentioned before, oftentimes what I want to save includes lots of photos and infographics that further explain what you are reading about. Without that supporting material, the article falls flat.
Think about famous magazines like Time and Life and how “a picture is worth a thousand words”. (It was one thing to read about Vietnam, and quite another to see a photograph of a man shot in the head at point blank range, right?)