Tutorial. Create scraper.

How I tried to build a web scraper and parser with database integration and analysis.



BIG DEAL №1. SCRAPER-PARSER CREATION NOTES.

0. What is my plan:

1. To crawl website

1.1. To scrape web content.

2. Parse web content.

3. Write to database.

4. Check syntax and errors.

5. Check if google already has it.

6. Write article to my website.

 

1. To crawl webcontent.

1.1. What package to choose?

1.1.1. Headless browsers are the best option: PhantomJS (no longer supported), Selenium (my choice). There are also SlimerJS, Playwright (good, but it is backed by Microsoft), Puppeteer, Rendertron, prerender.io, Cypress, WebdriverIO, Cucumber, Jest, Mocha, TestCafe, Jasmine, Robot Framework (also very popular), Appium, Serenity, Gauge.

The MechanicalSoup Python library is another option, and the Scrapy library suits spiders.

1.1.2. I need to pretend to be a famous crawler like yandexbot or googlebot.

1.2. What language to choose?

1.2.1. It is possible to use JavaScript (added to the browser as a userscript, for example with the Violentmonkey add-on for Chrome). But I will use Python (the most popular choice).

1.3. Selenium isn't a browser. It is just a driver that controls other browsers.

1.3.1. Check installed packages in virtual env $ python -m pip list


SELENIUM

1.3.3. Install the Selenium package inside the virtual env with $ pip install selenium, then install the Mozilla driver (geckodriver) if needed.

1.3.4. To start a Python shell for testing purposes just type $ python

1.3.5. Write a script to start the Firefox browser:

from selenium.webdriver import Firefox

from selenium.webdriver.firefox.options import Options

o = Options()

b = Firefox(options=o)

1.3.6. Start webpage from script:

b.get('https://www.amazon.com/s?rh=n%3A16225007011&fs=true&ref=lp_16225007011_sar')

b.set_window_size(1120, 550)  #Set window size

1.3.7. We have a list of items. I have to crawl them all. So I need to parse page to find items.

1.3.8. To choose element:

from selenium.webdriver.common.by import By

s = b.find_element(By.CLASS_NAME,'s-main-slot')

1.3.9. To check content use >>> s.text

1.3.10. To select nested elements >>> s2 = b.find_element(By.CSS_SELECTOR,".s-main-slot .sg-col-4-of-12") or use s7 = b.find_elements(By.CLASS_NAME,"a-size-base-plus")

1.3.11. But find_element returns only one element. To get a list of elements use find_elements:

s4 = b.find_elements(By.CSS_SELECTOR,".s-main-slot .sg-col-4-of-12 .sg-col-inner .s-widget-container .s-card-container .a-section .s-product-image-container .rush-component a")

1.3.12. To get attribute s4[0].get_attribute("href")

1.3.13. Form list from href attributes le = [c.get_attribute("href") for c in s4]

1.3.14. Then click next page on paginator:

p = b.find_element(By.CLASS_NAME,"s-pagination-next")

p.click()

1.3.15. Do not make HTTP requests too often. Pause between them:

import time 

time.sleep(5) #wait for 5 seconds
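
A fixed 5 second pause works, but a slightly random pause looks less mechanical. Below is a minimal sketch of that idea. The function name polite_pause and the 3 to 7 second range are my own assumptions, not something the site requires.

```python
import random
import time


def polite_pause(min_seconds=3.0, max_seconds=7.0, sleep=time.sleep):
    """Sleep for a random time between min_seconds and max_seconds.

    Returns the chosen delay so the caller can log it.
    The sleep argument exists only so the function can be tried without waiting.
    """
    if min_seconds < 0 or max_seconds < min_seconds:
        raise ValueError("need 0 <= min_seconds <= max_seconds")
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay
```

Call polite_pause() before each b.get(...) or p.click() instead of time.sleep(5).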

1.3.16. To open new TAB:

from selenium.webdriver.common.window import WindowTypes

b.switch_to.new_window(WindowTypes.TAB)

1.3.16.1. To open a link in a new tab, hold CTRL while clicking the link. Or make it the default in Firefox: about:config -> browser.search.openintab -> true // Didn't help

browser.urlbar.openintab -> true // Doesn't work

browser.newtabpage.enabled -> true // Doesn't work

browser.tabs.loadBookmarksInTabs -> true // Doesn't help

browser.link.open_newwindow -> 3 // Didn't help

browser.link.open_newwindow.restriction -> 0 // probably I have to restart

restart Firefox // Doesn't work

It seems Firefox doesn't keep the changes when it is started by the Selenium driver. I started Firefox as usual and repeated all the changes mentioned above. // Didn't help, because when Firefox is started by Selenium it gets default settings from the Selenium driver. Maybe it is possible to set these options through Selenium, but I don't have time.

1.3.16.2. Better to simulate holding CTRL while clicking.

from selenium.webdriver.common.action_chains import ActionChains

from selenium.webdriver.common.keys import Keys

ActionChains(b)\
        .key_down(Keys.CONTROL)\
        .click(s4[0])\
        .key_up(Keys.CONTROL)\
        .perform()

1.3.16.3. How to change tabs in the browser. To switch to the parent (original) tab use b.switch_to.window(b.window_handles[0]). To switch to the first tab opened from it use b.switch_to.window(b.window_handles[1])

I faced problems with this. It is better to save the handles first, first = b.window_handles[0] and second = b.window_handles[1], then call b.switch_to.window(second). Do not store a handle in b, because b is the browser object.

1.3.16.4. An alternative way to open a new tab is b.switch_to.new_window('tab') -> b.get("https://sometthing") // Another alternative is to stay in one tab and use b.back() and b.forward()

1.3.16.5. How to close tab? Just type b.close()

1.3.17. If you search for Selenium examples on the internet, remember the difference between javaCamelCase and python_snake_case.

1.3.18. To check HTML code of selected element use s4[0].get_attribute('innerHTML')

1.3.19. Get image with selenium.

1.3.19.1. To make screenshot use b.save_screenshot("/home/<some-path>/Pictures/<some-name>.png")

1.3.20. When I b.close() old tab and try to click next reference I get an error selenium.common.exceptions.NoSuchWindowException: Message: Browsing context has been discarded

1.3.20.1. Maybe I have to switch to main tab with b.switch_to.window(b.window_handles[0])  after b.close().

1.3.21. When I try to click element which is out of browser's view I get an error like selenium.common.exceptions.MoveTargetOutOfBoundsException: Message: (629, 1395) is out of bounds of viewport width (1280) and height (873) // So I have to determine if an element is out of view if true then scroll down to the element.

1.3.21.1. To get the x and y coordinates of an element's top left corner use s4[0].location['x'] and s4[0].location['y'], or use s4[0].rect // To get the window's dimensions use b.get_window_size()['height']

1.3.21.2. To scroll, run JavaScript from Python: b.execute_script('window.scrollBy(0,300)')

1.3.21.3. To scroll to the element:

s2 = s4[19]

b.execute_script('arguments[0].scrollIntoView();', s2)

or the best way to scroll:

s4[20].location_once_scrolled_into_view
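
To decide whether scrolling is needed at all, the element's rect can be compared with the visible height. This is only a rough sketch: the function name is mine, the window height is used as an approximation of the viewport height, and scroll_y (how far the page is already scrolled) is assumed to be known by the caller.

```python
def is_outside_viewport(rect, window_height, scroll_y=0):
    """Return True if the element is not fully inside the visible vertical range.

    rect: dict with 'y' and 'height' keys, like the one returned by element.rect
    window_height: visible height in pixels
    scroll_y: current vertical scroll offset of the page in pixels
    """
    top = rect["y"] - scroll_y          # position relative to the visible area
    bottom = top + rect["height"]
    return top < 0 or bottom > window_height
```

Possible usage: if is_outside_viewport(s4[0].rect, b.get_window_size()['height']) is true, call b.execute_script('arguments[0].scrollIntoView();', s4[0]) before clicking.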

1.3.22. Sometimes I get the error IndexError: list index out of range. It means I tried to switch to a new tab which doesn't exist yet. Add a pause: time.sleep(5)
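
Instead of a blind 5 second pause, it is possible to poll until the new tab really exists. A minimal sketch under my own assumptions: the helper name, the 10 second timeout and the 0.5 second polling step are arbitrary, and the only thing it needs from the driver is the window_handles attribute.

```python
import time


def wait_for_window_count(driver, expected, timeout=10.0, poll=0.5):
    """Poll driver.window_handles until it has at least `expected` entries.

    Returns the list of handles, or raises TimeoutError if the tab never appears.
    """
    deadline = time.monotonic() + timeout
    while True:
        handles = list(driver.window_handles)
        if len(handles) >= expected:
            return handles
        if time.monotonic() >= deadline:
            raise TimeoutError(f"only {len(handles)} window(s) after {timeout} s")
        time.sleep(poll)
```

Possible usage after the CTRL + click: handles = wait_for_window_count(b, 2), then b.switch_to.window(handles[1]).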

1.3.23. To obtain current url in Selenium use b.current_url

1.3.24. To translate a Selenium script from Java to Python, common tips are: b.getTitle() is b.title and b.findElement(By.name("txt")); is b.find_element(by=By.NAME, value="txt")
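
The naming rule can be sketched as a tiny converter. It is only a heuristic I wrote for illustration: it handles names like findElement, but it does not know that some Java getters became plain properties in Python (getTitle() is b.title, not b.get_title()), and it splits acronyms letter by letter.

```python
import re


def camel_to_snake(name):
    """Convert a Java-style camelCase name to Python-style snake_case."""
    # Insert an underscore before every capital letter except at the start.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


for java_name in ("findElement", "getAttribute", "executeScript"):
    print(java_name, "->", camel_to_snake(java_name))
```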

1.3.25. How to get xpath in firefox // Open developer tools -> inspector -> right click -> copy -> xpath

1.3.26. To find a link use e = b.find_element(By.PARTIAL_LINK_TEXT, "Link text")

1.3.27. To work with tables:

# Python version of the usual Java nested loop over rows and columns
for row in range(1, 6):
    for col in range(1, 4):
        print(b.find_element(By.XPATH, f"//div[@id='main']/table[1]/tbody/tr[{row}]/th[{col}]").text)

1.3.28. The translator shows a popup that makes the translate button unclickable, so the script stops with an exception.

1.3.28.1. Add from selenium.common.exceptions import WebDriverException then I write:

                try:
                    b.find_element(By.CSS_SELECTOR, ".lmt__language_select--source button").click()
                    b.find_element(By.XPATH,"//span[contains(text(),'Russian')]").click()    # choose Russian as the source language
                    b.find_element(By.CSS_SELECTOR, ".lmt__language_select--target").click()
                    b.find_element(By.XPATH,"//span[contains(text(),'"+i+"')]").click()    # choose target language i
                except WebDriverException:
                    b.find_element(By.XPATH,"//button[contains(text(),'Back to Translator')]").click()    # Close the popup
                    # then repeat all
                    b.find_element(By.CSS_SELECTOR, ".lmt__language_select--source button").click()
                    b.find_element(By.XPATH,"//span[contains(text(),'Russian')]").click()    # choose Russian as the source language
                    b.find_element(By.CSS_SELECTOR, ".lmt__language_select--target").click()
                    b.find_element(By.XPATH,"//span[contains(text(),'"+i+"')]").click()    # choose target language i

Didn't help

1.3.28.2. It's also possible to check each time whether the popup exists, which avoids relying on the exception.

                popup = b.find_elements(By.XPATH,"//button[contains(text(),'Back to Translator')]")
                if popup:
                    popup[0].click()    # close the popup first
                b.find_element(By.CSS_SELECTOR, ".lmt__language_select--source button").click()
                b.find_element(By.XPATH,"//span[contains(text(),'Russian')]").click()    # choose Russian as the source language
                b.find_element(By.CSS_SELECTOR, ".lmt__language_select--target").click()
                b.find_element(By.XPATH,"//span[contains(text(),'"+i+"')]").click()    # choose target language i

Works

1.3.28.3. I have to restart DeepL when the popup appears.

1.3.28.4. Still, it is not enough. Try to clear cookies: b.delete_all_cookies()

 

URLLIB3

1.3.19.2. import urllib.request

urllib.urlretrieve(url,"/home/john/Pictures/mytest002.jpg") // Doesn't work in Python 3, urlretrieve moved to urllib.request

urllib.request.urlretrieve(url,"/home/john/Pictures/mytest002.jpg") // It works
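
The same download can be wrapped in a small function that reports how many bytes were saved, which helps to notice empty or broken images. This is a sketch, not a finished downloader: the name download is mine and there is no retry or error handling.

```python
import urllib.request


def download(url, target_path):
    """Save the resource at url to target_path and return the number of bytes written."""
    with urllib.request.urlopen(url) as response, open(target_path, "wb") as out:
        data = response.read()
        out.write(data)
    return len(data)
```

Possible usage: size = download(url, "/home/john/Pictures/mytest002.jpg"), then skip the item if size is 0.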

1.3.19.2.1. The same with urllib3:

import urllib3

a = urllib3.PoolManager()

req = a.request('GET', 'https://some.site/file', preload_content=False)

with open('<some local path>/<some existing file>', 'wb') as somefile:

    # A manual "while True: d = req.read()" loop did not work for me. Writing req.data did.

    somefile.write(req.data)

# The with block closes the file, so somefile.close() is not needed.

req.release_conn()

1.3.19.3. I've got a problem. When I try to get an HTML element which doesn't exist, I get an error which stops script execution. For example if(b.find_element(By.CSS_SELECTOR,"#SomeElement")): causes the error "selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: #"

1.3.19.4. I'll try to get a list instead, because find_elements returns an empty list when nothing matches: if(len(b.find_elements(By.CSS_SELECTOR,"#SomeElement"))): // WORKS.
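
The list-based check can be wrapped in two small helpers. These are my own hypothetical helpers, not part of Selenium. They only rely on driver.find_elements(by, selector), so by can be By.CSS_SELECTOR, By.XPATH and so on.

```python
def element_exists(driver, by, selector):
    """Return True if at least one element matches, without raising."""
    return len(driver.find_elements(by, selector)) > 0


def first_or_none(driver, by, selector):
    """Return the first matching element, or None when nothing matches."""
    found = driver.find_elements(by, selector)
    return found[0] if found else None
```

Possible usage: if element_exists(b, By.CSS_SELECTOR, "#SomeElement"): ... or el = first_or_none(b, By.CSS_SELECTOR, "#SomeElement").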

1.3.19.5. If you get the selenium.common.exceptions.MoveTargetOutOfBoundsException error, it means you have to scroll the page down to the needed element before clicking it.

1.3.20. When I save an image and build its file name from the item title, I have to sanitize the name (delete quotes " etc). // with open('/home/Pictures/000-urllib'+ str(a_tit.text) +'.jpg','wb') as img:

1.3.20.1. To filter characters use s = "".join(filter(str.isalnum, mystring))
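
The isalnum filter also removes spaces, so the words of the title run together. A possible alternative is sketched below. The function name, the hyphen as separator and the 80 character limit are my own choices.

```python
import re


def safe_filename(title, extension=".jpg", max_length=80):
    """Turn an item title into a file name that is safe on common file systems."""
    # Replace every run of characters other than letters, digits, '-' and '_' with '-'.
    cleaned = re.sub(r"[^A-Za-z0-9_-]+", "-", title).strip("-")
    # Cut overly long titles and fall back to a fixed name if nothing usable is left.
    cleaned = cleaned[:max_length] or "untitled"
    return cleaned + extension


print(safe_filename('SanDisk 128GB "Ultra" microSDXC'))
```

Note that this keeps only ASCII letters and digits, so titles in other alphabets would need a different character class.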

 

PILLOW

1.3.20. How to edit images in python.

1.3.20.1. Choose a tool: the Pillow library, Presidio (but it is from Microsoft), the OpenCV library (powerful and open source), ImageIO.

1.3.20.2. $ pip install Pillow -> python -> from PIL import Image

Example of using Pillow to resize an image:

dim = (300, 300)

name = "/home/<some-path>/<any-file>.png"

output = name + ".thumbnail"

ima = Image.open(name)

ima.thumbnail(dim, Image.Resampling.LANCZOS) // thumbnail() changes ima in place and returns None. To increase the size use ima = ima.resize(dim, Image.Resampling.NEAREST)

ima.save(output, "PNG") // the format should match the file type

1.3.20.3. To make a grayscale image:

imgL = ima.convert("L")

imgL.show()

1.3.20.4. To make a pure black-and-white (1-bit) image: img2 = ima.convert("1")

1.3.20.5. To get image size in bytes:

import os

print(os.path.getsize(<file-name-and-path>))

or img.size to get dimensions.

1.3.20.6. To convert an image into JPG:

jpg = ima.convert("RGB")

jpg.save('/home/<path>/<new-name>.jpg', "JPEG")

1.3.20.7. How to rotate an image: img.rotate(90).save("wrotated.png")

1.3.20.8. How to crop: img2 = img.crop((0, 0, img.width // 2, img.height // 2)) then img2.show() // show() returns None, so do not assign its result

1.3.20.9. How to paste (insert an image): img.paste(img1, (100,100)). img1 should be smaller, and (100,100) is the offset from the top left corner.

1.3.20.10. To mirror image mirror = img.transpose(Image.FLIP_LEFT_RIGHT)

1.3.20.11. To apply blur:

from PIL import ImageFilter

img.filter(ImageFilter.GaussianBlur(radius = 1)).show() or img.filter(ImageFilter.BoxBlur(1)) or img.filter(ImageFilter.ModeFilter(size = 3)).show() or img.filter(ImageFilter.MinFilter(size = 3))

or ima=ima.filter(ImageFilter.BLUR) or c2=ima.filter(ImageFilter.SMOOTH)

1.3.20.12. To invert colors: from PIL import ImageChops -> ImageChops.invert(img).show()

1.3.20.13. To make sharp img.filter(ImageFilter.UnsharpMask(radius = 2, percent = 150, threshold = 4)).show() or ima=ima.filter(ImageFilter.DETAIL) or ima=ima.filter(ImageFilter.EDGE_ENHANCE) or ima=ima.filter(ImageFilter.SHARPEN) 

1.3.20.14. To intensify color and sharpness:

from PIL import ImageEnhance

enh = ImageEnhance.Color(img) or enh = ImageEnhance.Sharpness(img)

enh.enhance(3.0).show()

1.3.20.15. To regulate contrast and brightness:

enh = ImageEnhance.Contrast(img) or enh = ImageEnhance.Brightness(img)

enh.enhance(3.0).show()

or img.point(lambda i: i * 4.2).show()

1.3.20.16. Add borders:

from PIL import ImageOps

img = ImageOps.expand(img, border = 20, fill = 50) then img.show() // show() returns None, so call it separately

1.3.20.17. Create one image from three components. 

img3 = Image.composite(img1, img2, maskilon) or img = ImageChops.darker(img1, img2) or img = ImageChops.screen(img1, img2) or img = ImageChops.lighter(img1, img2)

1.3.20.18. To create a new image: img = Image.new(mode = "RGB", size = (911, 119), color = (33, 124, 88)) // no leading zeros, 033 is a syntax error in Python 3

1.3.20.19. Cool stuff. Create a fractal: im = Image.effect_mandelbrot((1512,1512), (-3,-2.5,2,2.5), 100) then im.show()

1.3.20.20. Set offset ImageChops.offset(img, 30, yoffset=None).show()

1.3.20.21. To add some text to image:

from PIL import ImageDraw, ImageFont

text = Image.new("RGBA", img.size, (255,255,255,0)) # Create blank transparent image for text

font = ImageFont.truetype("Pillow/Tests/fonts/FreeMono.ttf", 70) # Set font

df = ImageDraw.Draw(text) # set context

df.text((60, 60), "Worldends", font=font, fill=(255, 0, 255, 255)) # Set text, place, color, font or df.multiline_text((10, 10), "First word \n Second word", font=font, fill=(0, 0, 0))

Image.alpha_composite(img.convert("RGBA"), text).show() # both images must be RGBA and the same size

1.3.20.22. ImageOps.solarize(img, threshold=99) inverts all pixel values above the threshold. Or a fancy effect: ima=ima.filter(ImageFilter.MaxFilter(size=3)) spreads bright pixels (the size must be odd). Good for dark images.

1.3.20.23. Emboss effect: from PIL import ImageFilter -> ima = ima.filter(ImageFilter.EMBOSS)

1.3.20.24. Contour effect: from PIL import ImageFilter -> ima = ima.filter(ImageFilter.CONTOUR)

1.3.20.25. Fancy effect ima=ima.filter(ImageFilter.MinFilter(size=3)) spreads dark pixels (the size must be odd). Good for bright images.

1.3.20.26. Effect to highlight edges ima=ima.filter(ImageFilter.FIND_EDGES)

 

OpenCV

1.3.21.1. Install $ pip install opencv-python 

1.3.21.1.1. How to renew package // $ pip install opencv-python --upgrade

1.3.21.2. Get an image

import cv2

img = cv2.imread('/<some-path>/<some-file>.jpg')

1.3.21.3. To save an image: cv2.imwrite('<path>/<name>.png', img) // returns True on success

1.3.21.4. To crop: cv2.imwrite('<path>/<name>.png', img[100 : 500, 200 : 500]) // rows (y) first, then columns (x)

1.3.21.5. To resize img = cv2.resize(img, (300, 300))

1.3.21.6. To rotate an image:

h, w = img.shape[:2] // get image height and width

img2 = cv2.warpAffine(img, cv2.getRotationMatrix2D((w // 2, h // 2), -45, 1.0), (w, h)) // the center is given as (x, y)

1.3.21.7. To draw a rectangle: rectangle = cv2.rectangle(img.copy(), (100,100), (500,500), (0,255,0), 3)

1.3.21.8. Add text: text = cv2.putText(img, "Add some text", (300, 300), cv2.FONT_HERSHEY_SCRIPT_COMPLEX, 4, (255,255,0), 2)

1.3.21.9. Merge two images img3 = cv2.addWeighted(img1, 0.2, img2, 0.8, 0)

1.3.21.10. Erosion effect:

import numpy as np

img2 = cv2.erode(img, (np.ones((5, 5), np.uint8)) )

1.3.21.12. Add border img2 = cv2.copyMakeBorder(img, 10, 50, 50, 50, cv2.BORDER_REFLECT, None, value = 0)

1.3.21.13. Make grayscale img2 = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

1.3.21.14. To blur img2=cv2.blur(img, (5,5)) or img2=cv2.medianBlur(img, 5) or img2=cv2.GaussianBlur(img, (5, 5), 0)

1.3.21.15. To draw a line cv2.imwrite("<path>/img17.png", cv2.line(img, (5,5), (200,200), (255,0,0), 9))  // 9 is the thickness

1.3.21.16. To draw an arrow cv2.imwrite("/home/img18.png", cv2.arrowedLine(img, (500,500), (300,500), (0,255,0), 5))

1.3.21.17. To draw an ellipse cv2.imwrite("/home/john/Pictures/img19.png", cv2.ellipse(img, (800,800), (100,50), 45, 0, 300, (0,0,255),4 )) // To make it solid use thickness=-1

1.3.21.18. To capture video from webcam:

cap1 = cv2.VideoCapture(0) // or get video file cap1 = cv2.VideoCapture("/home/myhomevideo.mp4")

ret,frame = cap1.read() // to stop camera cap1.release()

cv2.imwrite("/home/Pictures/img22.png", frame)

1.3.21.19. To flip an image use:

cv2.flip(img, 0) // flips vertically

cv2.flip(img, 1) // flips horizontally (mirror)

cv2.flip(img, -1) // flips both vertically and horizontally

1.3.21.20. Converting png into jpg is very easy: cv2.imwrite("/home/Pictures/img29.jpg", img)

1.3.21.21. To change brightness:

img37 = cv2.cvtColor(img37, cv2.COLOR_BGR2HSV) // to change color balance, skip this conversion and work on the BGR image directly

h, s, v = cv2.split(img37)  // to change color balance: b, g, r = cv2.split(img37), OpenCV keeps channels in BGR order

v = cv2.add(v,30) // cv2.add saturates at 255, so no manual clipping is needed

img42 = cv2.merge((h,s,v))   // to change color balance: img42 = cv2.merge((b,g,r))

img42 = cv2.cvtColor(img42, cv2.COLOR_HSV2BGR) // convert back to BGR before saving

 

1.4 To debug python in Eclipse

1.4.1. Create a new PyDev project -> Next. Choose the directory of your .py script. // Eclipse doesn't see the selenium packages.

1.4.2. Try to activate the virtual environment, then start ./eclipse // Doesn't help

1.4.3. Open Eclipse->Preferences->PyDev->Interpreters->Python Interpreter -> Packages -> Manage with pip -> Install selenium // It works

 

1.5 Write to local mysql database

1.5.1. First of all I need to install the MySQL Connector driver to work with the database from Python. $ python -m pip install mysql-connector-python

1.5.2. Create mysql object:

import mysql.connector

db = mysql.connector.connect(

    host="localhost",

    user="123pmauser321",

    password="aa12345bb",

    database="scraper")

print(db)

1.5.3. To execute first command to show databases:

m = db.cursor()

m.execute("SHOW DATABASES")

for x in m:

      print(x)

1.5.4. How to insert data into:

m.execute("INSERT INTO `first` (`tit`, `ov`, `descr`, `price`, `imag`) VALUES ('el titulo 2', NULL, '', '', '');") 

db.commit()

1.5.5. Create an auto-incremented ID:

m.execute("ALTER TABLE `first` ADD `idot` INT NOT NULL AUTO_INCREMENT FIRST, ADD PRIMARY KEY (`idot`);")

db.commit()

1.5.6. Add a timestamp column that defaults to the current time:

m.execute("ALTER TABLE `first` ADD `date` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP AFTER `idot`;")

db.commit()

1.5.7. To get row from database:

m.execute("SELECT * FROM first;")

r = m.fetchall()

for x in r:

    print(x)

1.5.8. Insert multiple value:

sql = "INSERT INTO first (tit, descr) VALUES (%s, %s)"

v = [ ('John', 'Hustler'),

        ('Amy', 'Whinehouse'),

        ('Hannah', 'Ocean') ]

m.executemany(sql, v)

db.commit()
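
The placeholders matter because scraped titles and descriptions often contain quotes. Here is a self-contained sketch of the same pattern. It uses the standard sqlite3 module with an in-memory database only as a stand-in so it runs without a MySQL server. The table layout and the sample rows are made up, and sqlite3 writes placeholders as ? where mysql.connector uses %s.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for the MySQL connection.
db = sqlite3.connect(":memory:")
m = db.cursor()
m.execute("CREATE TABLE first (idot INTEGER PRIMARY KEY AUTOINCREMENT, tit TEXT, descr TEXT)")

# Values scraped from a page may contain quotes; placeholders keep them out of the SQL text.
rows = [("Logitech H390", 'Headset with "noise cancelling" mic'),
        ("SanDisk 128GB", "It's a memory card")]

# sqlite3 uses ? placeholders; mysql.connector uses %s instead.
m.executemany("INSERT INTO first (tit, descr) VALUES (?, ?)", rows)
db.commit()

m.execute("SELECT idot, tit, descr FROM first ORDER BY idot")
stored = m.fetchall()
for row in stored:
    print(row)
```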

1.5.9. When I try to use a variable from one Selenium tab in another Selenium tab I get the error selenium.common.exceptions.NoSuchElementException: Message: Web element reference not seen before: // To work around this, copy the element's text into a new variable while still in the old tab, like txt = elem.text

1.5.10. I have to generate a unique ID for each new MySQL record. I could get the last record's ID and increment it by one. // Just set the database column to AUTO_INCREMENT.

1.5.11. To get the last SQL row use: m.execute("SELECT `product_id` FROM `products` ORDER BY `product_id` DESC LIMIT 1; ") // I get an error when I use ORDER BY. I have MySQL version 5.7.27.

1.5.12. To create a random auto meta description from an overview list use: import random -> random.choice(mylist)
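
A slightly more defensive sketch of the same idea: skip empty overview lines and cut the result to a maximum length. The function name and the 160 character limit are my own illustrative choices.

```python
import random


def pick_meta_description(overview, max_length=160, rng=random):
    """Pick one non-empty line from the overview list as a meta description."""
    candidates = [line.strip() for line in overview if line.strip()]
    if not candidates:
        return ""
    return rng.choice(candidates)[:max_length]


print(pick_meta_description(["", "Wired headset", "  Noise cancelling microphone  "]))
```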

 

1.6 Translate database rows

1.6.1. Use the DeepL translator, it is the best service.

1.6.2. I won't create a new row for translations. I am going to add new columns to the same row.

1.6.3. I have to set languages up in Deepl. To find element by text use t12 = b.find_element(By.XPATH,"//span[contains(text(),'English')]")

1.6.4. To clear input field inp.clear()

 

1.7 Automatically write to Joomla's Virtuemart MySQL database and publish

1.7.1. A product is added to the _virtuemart_products, _virtuemart_products_en and _virtuemart_product_categories tables.

1.7.2. How to create a MySQL connection to a remote database.

1.7.3. In the host's ISP manager -> Databases -> Users -> Allow remote connections

1.7.4. While trying to connect remotely with the Python connector I got the error _mysql_connector.MySQLInterfaceError: Can't connect to MySQL server on '8.8.8.220:3306' // I had just entered the wrong server IP

1.7.5. To insert remotely into virtuemart_products:

>>> m.execute("INSERT INTO `<prefix>_virtuemart_products` (`virtuemart_product_id`, `virtuemart_vendor_id`, `product_parent_id`, `product_sku`, `product_gtin`, `product_mpn`, `product_weight`, `product_weight_uom`, `product_length`, `product_width`, `product_height`, `product_lwh_uom`, `product_url`, `product_in_stock`, `product_ordered`, `product_stockhandle`, `low_stock_notification`, `product_available_date`, `product_availability`, `product_special`, `product_discontinued`, `product_sales`, `product_unit`, `product_packaging`, `product_params`, `hits`, `intnotes`, `metarobot`, `metaauthor`, `layout`, `published`, `pordering`, `created_on`, `created_by`, `modified_on`, `modified_by`, `locked_on`, `locked_by`) VALUES ('333', '1', '0', 'petimetr_sku_111', 'special_product_petimetr_gtin_222', '', '222.0000', 'G', '690.0000', '69.0000', '69.0000', 'MM', '', '911', '0', '0', '0', '2018-03-03 00:00:00', '', '1', '0', '0', 'KG', NULL, 'min_order_level=\"\"|max_order_level=\"\"|step_order_level=\"\"|product_box=\"1\"|', NULL, '', '', '222 pushimeters', '', '1', '0', '2018-03-03 15:00:36', '468', '2022-09-17 05:57:23', '468', '0000-00-00 00:00:00', '0');   ")

>>> db.commit()

1.7.6. To insert remotely into the language table (virtuemart_products_en, or virtuemart_products_ru_ru as in this example):

>>> m.execute("INSERT INTO `<prefix>_virtuemart_products_ru_ru` (`virtuemart_product_id`, `product_s_desc`, `product_desc`, `product_name`, `metadesc`, `metakey`, `customtitle`,`slug`) VALUES ('333', '333 product s description', '333 product description', '333 ITEM product name', '333 metadescription meta', '333, desc, meta, item', '333 CUSTOM title 333', '333-item-slug');")

>>> db.commit()

1.7.7. It is important to put table and column names in backticks (`), not in single quotes ('). Single quotes are for string values.

1.7.8. To insert price:

>>> m.execute("INSERT INTO `<prefix>_virtuemart_product_prices` (`virtuemart_product_price_id`, `virtuemart_product_id`, `virtuemart_shoppergroup_id`, `product_price`, `override`, `product_override_price`, `product_tax_id`, `product_discount_id`, `product_currency`, `product_price_publish_up`, `product_price_publish_down`, `price_quantity_start`, `price_quantity_end`, `created_on`, `created_by`, `modified_on`, `modified_by`, `locked_on`, `locked_by`) VALUES (NULL, '333', '0', '3333.000000', '0', '0.00000', '-1', '-1', '144', '0000-00-00 00:00:00', '0000-00-00 00:00:00', '0', '0', '2018-03-12 18:12:02', '468', '2018-03-14 09:02:58', '468', '0000-00-00 00:00:00', '0');")

>>> db.commit()

1.7.9. To copy an image to the remote server over SSH, use the scp tool.

$ scp /home/Pictures/mytest17.png u022424311@137.140.192.232:/var/www/u022424311/data/www/rus-equip.com/images/virtuemart/product/computer/

then enter password.

1.7.10. To run a Linux command like scp from the Python shell use the os library or the subprocess library. // subprocess does not help with interactive prompts such as a password. Use pexpect.

>>> import os, subprocess

>>> c = 'ls'

>>> t = subprocess.Popen([c,'-l'], stdout = subprocess.PIPE) // or subprocess.run(["ls","-l"]) // Popen starts new process

>>> print( t.communicate()[0].decode() ) // To get output

or

>>> os.system("ls -l")
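
For non-interactive commands, subprocess.run can capture the output directly. A minimal sketch: the child command here is just a Python one-liner chosen so the example runs on any system, in the notes above it would be something like ["ls", "-l"].

```python
import subprocess
import sys

# Run a child process and capture what it prints.
# sys.executable is used instead of 'ls' only so the sketch runs on any OS.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from child')"],
    capture_output=True,  # collect stdout and stderr instead of printing them
    text=True,            # decode bytes to str
    check=True,           # raise CalledProcessError on a non-zero exit code
)
output = result.stdout.strip()
print(output)
```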

1.7.11. The problem is that I have to enter the password somehow. Maybe the pexpect lib will help. To install: $ pip install pexpect.

1.7.12. To interact with scp:

import pexpect

c = pexpect.spawn('scp <local-file> someuser@somesite.com:.')

c.expect('(?i)password:')

c.sendline(mypass) # mypass holds the password

c.expect(pexpect.EOF) # wait until the copy finishes

 

1.8 Virtuemart

1.8.1. Unable to show the image in Virtuemart. I get vmError: Couldnt create thumb, file not found /var/www/u0224243/data/www/rus-equip.com/images/virtuemart/typeless//images/virtuemart/product/computer/2022-09-25_01-11-3_Logitech-H390-Wired-Headset-Stereo-Headphones-with-Noise-Cancelling-Microphone-Black_2.jpg // I have to put a resized small copy of the image into the resize folder. // Just save the item after publishing.

1.8.2. So I have to resize the image to make it 8 KB.

1.8.3. Virtuemart doesn't show items in the category list when the English language is selected. When I change the item's category it works fine. It looks like a problem with the category settings for the English language. // Menu -> Menu eng -> Select manufacturer -> Please Select

 

1.9 Use VPN

1.9.1. I ran into the server's bot detection, so I have to use a VPN or a proxy.

 

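Check duplicates. Make a set from the list to drop repeated items: a = set(a) (a = tuple(a) only turns the list into a tuple and keeps the duplicates)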

Set useragent for selenium

Clear Image metadata
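
For the duplicate check in the to-do list above, here is a small sketch that removes repeated links while keeping the order in which they were scraped. The function name is mine. It relies on dict keys being unique and ordered by insertion.

```python
def unique_in_order(items):
    """Drop repeated items while keeping the first occurrence of each."""
    return list(dict.fromkeys(items))


links = ["/item/1", "/item/2", "/item/1", "/item/3", "/item/2"]
print(unique_in_order(links))
```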