Sciencemadness Discussion Board
Author: Subject: GREAT old books site
a_bab
Hazard to Others
***




Posts: 458
Registered: 15-9-2002
Member Is Offline

Mood: Angry !!!!!111111...2?!

[*] posted on 30-3-2003 at 13:42
GREAT old books site


I just found a French site that is something like the MOA site. You guys may know that the MOA site doesn't let you save a whole book (same as NAP edu). Instead, you have to browse the book you want page by page and save it page by page. A nasty task when you are dealing with a book of 400 pages or more.

Well, this site lets you save the whole book in a nice, clear PDF format. I am saving "Elements of chemistry, theoretical and practical / by D. B. Reid" right now, and it's about 40 MB (over 900 pages). I suppose it could be compressed with the Xerox tool, as the pages are B&W.

A simple query with "chimie" (chemistry in French) as the domain returned more than 500 entries! The good news is that there are SOME books in English (maybe less than 10%, from what I saw). This is normal, because the site is a French one, so nearly all the books are in French. There are lots of German books as well, plus Russian and others.

So, for those who love old science books (chemistry in my case), happy downloading!

Oops, nearly missed. The address is this one
Blind Angel
National Hazard
****




Posts: 845
Registered: 24-11-2002
Location: Québec
Member Is Offline

Mood: Meh!

[*] posted on 30-3-2003 at 19:12


Since my mother tongue is French, I was happy to see that this site exists, so I searched for "chimie organique", which returned about 89 results. But there's a big problem: every time I try to download a file, it tells me: this file is protected by copyright and can't be downloaded :( :(



/}/_//|//) /-\\/|//¬/=/_
My PGP Key Fingerprint: D4EA A609 55E4 7ADD 8529 359D D6E2 33F6 4C76 78ED
a_bab
Hazard to Others
***




Posts: 458
Registered: 15-9-2002
Member Is Offline

Mood: Angry !!!!!111111...2?!

[*] posted on 31-3-2003 at 12:03


Which book did you try to download? It worked fine for me. There may be a few that are copyrighted, though. Dunno...
Polverone
Now celebrating 21 years of madness
*********




Posts: 3186
Registered: 19-5-2002
Location: The Sunny Pacific Northwest
Member Is Offline

Mood: Waiting for spring

[*] posted on 31-3-2003 at 12:17
had the same problem


More often than not, the books that interested me were copyrighted. I tried to grab Beilstein and couldn't, for example.
a_bab
Hazard to Others
***




Posts: 458
Registered: 15-9-2002
Member Is Offline

Mood: Angry !!!!!111111...2?!

[*] posted on 31-3-2003 at 12:26


Damn. Is it possible to get it page by page? I can set something up if necessary.
The link is still worth it for the old ones, though.
Polverone
Now celebrating 21 years of madness
*********




Posts: 3186
Registered: 19-5-2002
Location: The Sunny Pacific Northwest
Member Is Offline

Mood: Waiting for spring

[*] posted on 31-3-2003 at 12:31
sorry


I was actually using the page-by-page system, and it wouldn't let me view any of Beilstein. I'm not sure why they even list books online that they have no intention of allowing people to view, but I didn't build the site.
tangent
Harmless
*




Posts: 17
Registered: 14-3-2003
Member Is Offline

Mood: introspective..

[*] posted on 1-4-2003 at 00:33


Think about Perl for automating fetching and processing.

This book is online.

http://www.oreilly.com/catalog/webclient/

This one is more current, but you need to buy a copy or use the library:

http://www.oreilly.com/catalog/perllwp/

Other resources:

http://www.linpro.no/lwp/

http://ftp.ics.uci.edu/pub/websoft/libwww-perl/

http://search.cpan.org/author/GAAS/libwww-perl/lwpcook.pod

Other options include Micro$haft's "save page/site/directory offline" feature in IE and Adobe Acrobat 5's ability to gather web pages and convert them into various formats.

There are a variety of programs out there that will harvest news or data of various types, including archiving specific sites.

Also check out this one next time you're at the bookstore, and the online examples.

http://www.oreilly.com/catalog/googlehks/
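For anyone who wants to see what "automating fetching" looks like in practice, here is a minimal sketch of the page-by-page idea, written in Python rather than Perl. The URL pattern, the `archive.example.org` host, the file naming, and the function names are all assumptions made up for illustration; a real archive will have its own addressing scheme, and this does nothing about books the site refuses to serve.

```python
import time
from pathlib import Path
from typing import Callable, Iterator, List
from urllib.request import urlopen

# Hypothetical URL pattern, for illustration only. It is NOT the real
# address scheme of any site mentioned in this thread.
PAGE_URL_TEMPLATE = "https://archive.example.org/book/{book_id}/page/{page}"


def page_urls(book_id: str, first: int, last: int) -> Iterator[str]:
    """Yield one URL per page, from `first` to `last` inclusive."""
    for page in range(first, last + 1):
        yield PAGE_URL_TEMPLATE.format(book_id=book_id, page=page)


def http_get(url: str) -> bytes:
    """Fetch a URL and return the raw response body."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def fetch_book(
    book_id: str,
    first: int,
    last: int,
    out_dir: str,
    fetch: Callable[[str], bytes] = http_get,
    delay: float = 1.0,
) -> List[Path]:
    """Save each page to `out_dir` as page_NNNN.bin and return the paths."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for number, url in enumerate(page_urls(book_id, first, last), start=first):
        path = target / f"page_{number:04d}.bin"
        path.write_bytes(fetch(url))
        saved.append(path)
        time.sleep(delay)  # pause between requests to go easy on the server
    return saved
```

Something like `fetch_book("some-id", 1, 400, "my_book")` would then replace 400 rounds of clicking and saving, assuming the site really does expose pages at predictable addresses. The `fetch` parameter is there so the download step can be swapped out or tested without touching the network.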


-t




...Politics then reigned in Rome. She had her two sisters, Deceit and Greed, as ministers. Under their command, Ignorance, Fanaticism and Fury were seen to prevail in Europe. They brought wretchedness everywhere they went. Reason hid in the bottom of a well, along with her daughter, Truth. No one knew where this well was. If anyone had guessed, he would have gone down into it to cut the throats of mother and daughter.

-Voltaire, Eloge historique de la Raison, 1775

Throughout the world ... we use the word 'politics' to describe the process so well: 'Poli' in Latin meaning 'many' and 'tics' meaning 'blood****ing creatures'.
Organikum
resurrected
*****




Posts: 2329
Registered: 12-10-2002
Location: Europe
Member Is Offline

Mood: busy and in love

[*] posted on 1-4-2003 at 14:02


Beilstein III works for me page by page, but how do I get the whole file in one piece?
I don't speak French at all.
Polverone
Now celebrating 21 years of madness
*********




Posts: 3186
Registered: 19-5-2002
Location: The Sunny Pacific Northwest
Member Is Offline

Mood: Waiting for spring

[*] posted on 1-4-2003 at 14:20


Your advice in general is good, Tangent. Have you ever tried the Google Web API system? It's really nice for automated queries! I haven't used it since last summer, though, so I don't know if it ever moved beyond the limitations of the beta stage.

However, the books of the BNF archive can be downloaded as single-file PDFs if you know the unique identification code for the work in question. Web-rippers and offline browsers are great for situations where you can't just get one big file, like my Muspratt pages.
blazter
Hazard to Self
**




Posts: 71
Registered: 3-9-2002
Member Is Offline

Mood: No Mood

[*] posted on 1-4-2003 at 15:40
linux web mirroring


Anyone who runs Linux should already have a very powerful web mirroring tool installed: wget. Anything that wget cannot do can be done by another tool called pavuk, which should be available at freshmeat.net. In a previous life, someone I knew saw these tools being used to rip complete ebook web sites :D
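To give a rough idea of what a mirroring tool does under the hood, here is a small Python sketch of one step: pulling the links out of a page and keeping only the ones on the same host, so the crawl doesn't wander off across the whole web. This is not how wget or pavuk are implemented; the class and function names are made up for illustration, and a real mirror also has to handle images, depth limits, robots.txt, and link rewriting.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def same_host_links(base_url: str, html: str) -> List[str]:
    """Return absolute, de-duplicated links that stay on base_url's host."""
    parser = LinkCollector()
    parser.feed(html)
    host = urlparse(base_url).netloc
    seen = set()
    result = []
    for href in parser.links:
        # Resolve relative links and drop any #fragment part.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if urlparse(absolute).netloc == host and absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result
```

A mirroring loop would fetch a page, save it, call `same_host_links` on it, and queue whatever it hasn't visited yet.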
a_bab
Hazard to Others
***




Posts: 458
Registered: 15-9-2002
Member Is Offline

Mood: Angry !!!!!111111...2?!

[*] posted on 1-4-2003 at 23:33


There are several tools for ripping an entire site. These tools are called web spiders. The best one (in my opinion) is Teleport Pro. Then Snake, etc. You can get the entire site, relink the files, recreate the folder structure as on the server, etc.




Organikum, in order to download any book from the Gallica site, you have to click on the "Téléchargement de l'ouvrage" (download the work) option at the top right. Under "Choisissez le début de votre sélection:", select "La 1ère page", and under "Choisissez le nombre de pages : ", select "Jusqu'à la fin de l'ouvrage". That means the beginning of the selection is the first page of the book, and the number of pages to download is "until the end of the book". Then click on "Fichier PDF" (PDF file). You may run into trouble due to heavy traffic; all you can do is wait and try again. If it works, you'll get a link to download the book.
Organikum
resurrected
*****




Posts: 2329
Registered: 12-10-2002
Location: Europe
Member Is Offline

Mood: busy and in love

[*] posted on 2-4-2003 at 13:56


Thanks a_bab, I'll try it this way.