{"id":233246,"date":"2023-02-09T16:25:00","date_gmt":"2023-02-09T13:25:00","guid":{"rendered":"https:\/\/wordpress.mediadoma.com\/?p=233246"},"modified":"2022-11-10T20:05:54","modified_gmt":"2022-11-10T17:05:54","slug":"tapaustutkimus-kaeytae-phpqueryae-indeksoimaan-3000-tumblr-kuvia","status":"publish","type":"post","link":"https:\/\/wordpress.mediadoma.com\/fi\/tapaustutkimus-kaeytae-phpqueryae-indeksoimaan-3000-tumblr-kuvia\/","title":{"rendered":"Case Study: Use PHPQuery to Crawl 3000 Images from Tumblr"},"content":{"rendered":"<p>Tumblr has some good pictures. We could use the Tumblr APIs to search for and download them, but that usually requires registration and API keys. Another way is to crawl the HTML pages and parse the <a href=\"https:\/\/helloacm.com\/using-domdocument-in-php-to-process-output-html\/\" target=\"_blank\" rel=\"noopener nofollow\" class=\"external external_icon\">DOM (Document Object Model)<\/a>, which lets us fetch the image URLs and their descriptions.<\/p>\n<p>A handy library called PHPQuery helps here. It lets us write jQuery-style <a href=\"https:\/\/helloacm.com\/multilingual-bug-fix-php-7-wordpress-4-4-compatibility-wp-rocket-2-6-14\/\" target=\"_blank\" rel=\"noopener nofollow\" class=\"external external_icon\">PHP<\/a>, selecting elements with CSS selectors just as we would in <a href=\"https:\/\/helloacm.com\/jquery-examples-random-squares\/\" target=\"_blank\" rel=\"noopener nofollow\" class=\"external external_icon\">jQuery<\/a>. 
That makes PHP a capable tool for analysing the DOM of any HTML page.<\/p>\n<p>The following pseudo-code illustrates how to parse <a href=\"https:\/\/helloacm.com\/html-tip-speed-up-dns-query-by-dns-prefetch\/\" target=\"_blank\" rel=\"noopener nofollow\" class=\"external external_icon\">HTML<\/a> pages and grab the images.<\/p>\n<pre><code>require('phpQuery.php');\nrequire('app.php'); \/\/ site-specific helpers; assumed to define get_ip_address() and UploadPic()\n\n$ip = get_ip_address();\n\nfunction grab($url, $lvl = 5) {\n  global $ip;\n  if ($lvl &lt;= 0) {\n    return; \/\/ recursion depth exhausted\n  }\n  $doc = phpQuery::newDocumentFile($url); \/\/ load the page into phpQuery\n  foreach (pq('div.TumbPostPane') as $p) {\n    $img = pq($p)-&gt;find('img.PhotoPostMainPhoto')-&gt;attr('src');\n    $desc = htmlspecialchars(trim(pq($p)-&gt;find('div.MetaPanel')-&gt;html()));\n    $next = pq($p)-&gt;find('a')-&gt;attr('href'); \/\/ link to follow from this post\n    $err = '';\n    if (UploadPic($img, $desc, $err, $ip)) { \/\/ find pictures and save locally\n      echo \"OK = $err \\n\";\n    } else {\n      echo str_replace(\"&lt;br \/&gt;\", \"\\n\", \"Error = $err \\n\");\n    }\n    grab($next, $lvl - 1); \/\/ recursive download\n  }\n}\n\ngrab(\"https:\/\/uploadbeta.com\", 1);<\/code><\/pre>\n<p>With small modifications, the script can crawl several thousand images in a few minutes. All images are stored in local databases on the <a href=\"https:\/\/helloacm.com\/ping-when-vpsdedicate-server-is-restarting\/\" target=\"_blank\" rel=\"noopener nofollow\" class=\"external external_icon\">VPS server<\/a>. The images can be viewed at: <a href=\"https:\/\/uploadbeta.com\/picture-gallery\/?sort=1\" target=\"_blank\" rel=\"noopener nofollow\" class=\"external external_icon\">uploadbeta.com<\/a><\/p>\n<p>It is better to pause between page requests; otherwise your <a href=\"https:\/\/helloacm.com\/what-is-my-ip\/\" target=\"_blank\" rel=\"noopener nofollow\" class=\"external external_icon\">IP address<\/a> may be blocked.<\/p>\n<p><a href=\"https:\/\/wordpress.mediadoma.com\/wp-content\/uploads\/2022\/01\/post-154242-61e53c0a386d6.jpg\" data-rel=\"lightbox\"><img decoding=\"async\" class=\"SDStudio-light-box-enable SDStudio-editor-tools-md-imp\" src=\"https:\/\/wordpress.mediadoma.com\/wp-content\/uploads\/2022\/01\/post-154242-61e53c0a386d6.jpg\" alt=\"Case Study: Use PHPQuery to Crawl 3000 Images from Tumblr\" ><\/a><\/p>\n<p>Image crawling<\/p>\n<p>PS: the Image Upload site supports a few APIs for various purposes, subject to a fair-use policy: <a href=\"https:\/\/uploadbeta.com\/picture-gallery\/faq.php#api\" 
target=\"_blank\" rel=\"noopener nofollow\" class=\"external external_icon\">https:\/\/uploadbeta.com\/picture-gallery\/faq.php#api<\/a><\/p>\n<p><div id=\"PostUnique_PostSource\" style=\"padding-top: 50px\">Source: <a target=\"_blank\" rel=\"noopener nofollow\" href=\"\/\/helloacm.com\" class=\"external external_icon\">helloacm.com<\/a><\/div><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Case Study: Use PHPQuery to Crawl 3000 Images from Tumblr<\/p>\n","protected":false},"author":1,"featured_media":220869,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","_wp_rev_ctl_limit":""},"categories":[895,1018,895,917,917,1110,1018,843,803,803,843],"tags":[1166],"class_list":{"0":"post-233246","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","6":"hentry","7":"category-koodi","8":"category-hyodyllisia-sivustoja","10":"category-muut","12":"category-n-a","14":"category-opetusohjelmia","15":"category-php-5","18":"tag-affiai-fi"},"_links":{"self":[{"href":"https:\/\/wordpress.mediadoma.com\/fi\/wp-json\/wp\/v2\/posts\/233246","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wordpress.mediadoma.com\/fi\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wordpress.mediadoma.com\/fi\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wordpress.mediadoma.com\/fi\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/wordpress.mediadoma.com\/fi\/wp-json\/wp\/v2\/comments?post=233246"}],"version-history":[{"count":0,"href":"https:\/\/wordpress.mediadoma.com\/fi\/wp-json\/wp\/v2\/posts\/233246\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/wordpress.mediadoma.com\/fi\/wp-json\/wp\/v2\/media\/220869"}],"wp:attachment":[{"href":"https:\/\/wordpress.mediadoma.com\/fi\/wp-json\/wp\/v2\/media?parent=233246"}],"wp:term":[{"taxonomy":"category","embeddable":true
,"href":"https:\/\/wordpress.mediadoma.com\/fi\/wp-json\/wp\/v2\/categories?post=233246"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wordpress.mediadoma.com\/fi\/wp-json\/wp\/v2\/tags?post=233246"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}