Category Archives: Past

Imagebot setup.py and Installation

I developed this scraper about 10 years ago, and a lot has probably changed since then. I went through the ‘setup.py’ file and analyzed the code in it. The file does quite a few things. I need to revisit them, and I probably need to look up the documentation for the functions and Python concepts I used there.

I chose to install the package.

pip3 install imagebot-1.2.1.tar.gz

This command failed with the error ‘externally-managed-environment’. I tried another approach.

pip3 install imagebot

This command failed with the same error. I probably need to read up on the Python environment on the Mac, virtual environments, etc.
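As far as I can tell, the ‘externally-managed-environment’ error comes from PEP 668: pip refuses to install into a Python installation that is marked as managed by an external package manager, and the usual way around it is a virtual environment. The following is a minimal sketch of that idea, not a definitive fix. The helper names ‘is_externally_managed’ and ‘create_env’ are made up for illustration.

```python
import sys
import sysconfig
import venv
from pathlib import Path


def is_externally_managed() -> bool:
    """Return True if this interpreter is marked as externally managed (PEP 668)."""
    # PEP 668: the marker file lives in the standard library directory.
    marker = Path(sysconfig.get_path("stdlib")) / "EXTERNALLY-MANAGED"
    # Inside a virtual environment the marker does not apply.
    in_venv = sys.prefix != sys.base_prefix
    return marker.exists() and not in_venv


def create_env(env_dir, with_pip=True) -> Path:
    """Create a virtual environment and return the path to its Python interpreter."""
    env_dir = Path(env_dir)
    venv.create(env_dir, with_pip=with_pip)
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"
```

Calling create_env('imagebot-env') (the directory name is only an example) and then running the returned interpreter with ‘-m pip install imagebot’ should install the package into the environment instead of the system Python.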

ImageBot Setup

I am going through some old code that I wrote. This particular package, published on PyPI, uses the ‘scrapy’ Python package to scrape a website for images.

The setup configuration can be viewed inside the ‘setup.py’ file. This file is generally added to a Python project to turn it into a package that is easy to install with a tool like ‘pip’ from a website like ‘pypi.org’. The ‘setup.py’ file makes it easy to declare the dependencies, which the package installer can then install automatically. This ‘imagebot’ package depends on a few external Python packages: ‘scrapy’, a Python scraping package; ‘PIL/Pillow’, a Python image processing library; and ‘mutils’, a Python library I wrote to hold commonly used Python code. I think this ‘setup.py’ is the file a package installer executes first. One can mention metadata such as the author, version, categories, email and description in this file. One can also include additional code in this file; code that needs to run at installation time fits here.
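The metadata and dependencies declared in ‘setup.py’ end up in the installed package's metadata, where they can be read back. Here is a small sketch using the standard library's ‘importlib.metadata’ (Python 3.8+). The function name ‘describe_package’ is my own, and what it returns for ‘imagebot’ depends on what is actually installed.

```python
from importlib import metadata


def describe_package(name):
    """Return version, summary and declared dependencies of an installed package.

    Returns None if no distribution with that name is installed.
    """
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        return None
    return {
        "version": dist.version,
        # .get() avoids an error if the Summary field is missing.
        "summary": dist.metadata.get("Summary"),
        # requires is None when the package declares no dependencies.
        "requires": dist.requires or [],
    }
```

For example, describe_package('imagebot') should list the ‘scrapy’ and ‘mutils’ requirements once the package is installed, and returns None otherwise.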

import ez_setup
ez_setup.use_setuptools()
from setuptools import setup, find_packages
import platform
import imp
from imagebot.version import version
with open('imagebot/env.py', 'w') as f:
    f.write('env = \'release\'\n')
install_requires = ['scrapy>=1.0.1', 'mutils>=1.0.4']
try:
    imp.find_module('PIL')
except ImportError:
    install_requires.append('Pillow')
entry_points = {}
entry_points['console_scripts'] = ['imagebot=imagebot.main:main']
setup(
    name = 'imagebot',
    description = 'A web bot to crawl websites and scrape images.',
    version = version,
    author = 'Amol Umrale',
    author_email = 'babaiscool@gmail.com',
    url = 'http://pypi.python.org/pypi/imagebot/',
    packages = find_packages(),
    package_data = {'imagebot': ['tables.sql']},
    scripts = ['ez_setup.py'],
    entry_points = entry_points,
    install_requires = install_requires,
    classifiers = [
        'Development Status :: 4 - Beta',
        'Environment :: Console',
        'Framework :: Scrapy',
        'Framework :: Twisted',
        'License :: OSI Approved :: MIT License',
        'Natural Language :: English',
        'Operating System :: POSIX :: Linux',
        'Operating System :: Microsoft :: Windows',
        'Programming Language :: Python :: 2.7',
        'Topic :: Internet :: WWW/HTTP :: Indexing/Search'
    ]
)
with open('imagebot/env.py', 'w') as f:
    f.write('env = \'dev\'\n')
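The PIL check in this file relies on the ‘imp’ module, which was deprecated for a long time and removed in Python 3.12, so that part cannot run on a current interpreter. A rough modern equivalent, assuming the intent is the same (add Pillow only when no ‘PIL’ module can be found), could use ‘importlib.util.find_spec’. The function name ‘build_install_requires’ is mine and not part of the original file.

```python
import importlib.util


def build_install_requires():
    """Build the dependency list, adding Pillow only if PIL cannot be found."""
    install_requires = ['scrapy>=1.0.1', 'mutils>=1.0.4']
    # find_spec returns None when no top-level module named 'PIL' exists.
    if importlib.util.find_spec('PIL') is None:
        install_requires.append('Pillow')
    return install_requires
```

Since pip now mostly installs from pre-built wheels, where ‘setup.py’ is not executed on the user's machine, simply listing ‘Pillow’ unconditionally might be the simpler choice today.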

Imagebot

I think that going through my past projects and revisiting their code might help me rediscover my own philosophy from those days, more than 10 years ago. Back then, I was working on many projects on my own, pushing code to GitHub and to PyPI. Those days may be over now; I might be in a burnout. Revisiting the code might help bring back what I have lost to the burnout.

Imagebot is one of those projects from the past. It is a website/image scraper written in Python, using the scraping framework, Scrapy.