Directory: /opt/alt/python311/lib64/python3.11/urllib/__pycache__/
Current File: //opt/alt/python311/lib64/python3.11/urllib/__pycache__/robotparser.cpython-311.pyc
""" robotparser.py

    Copyright (C) 2000  Bastian Kleineidam

    You can choose between two licenses when using this package:
    1) GNU GPLv2
    2) PSF license for Python 2.2

    The robots.txt Exclusion Protocol is implemented as specified in
    http://www.robotstxt.org/norobots-rfc.txt
"""

import collections
import urllib.parse
import urllib.request

__all__ = ["RobotFileParser"]

RequestRate = collections.namedtuple("RequestRate", "requests seconds")


class RobotFileParser:
    """ This class provides a set of methods to read, parse and answer
    questions about a single robots.txt file.

    """

    def __init__(self, url=''):
        self.entries = []
        self.sitemaps = []
        self.default_entry = None
        self.disallow_all = False
        self.allow_all = False
        self.set_url(url)
        self.last_checked = 0

    def mtime(self):
        """Returns the time the robots.txt file was last fetched.

        This is useful for long-running web spiders that need to
        check for new robots.txt files periodically.

        """
        return self.last_checked

    def modified(self):
        """Sets the time the robots.txt file was last fetched to the
        current time.

        """
        import time
        self.last_checked = time.time()

    def set_url(self, url):
        """Sets the URL referring to a robots.txt file."""
        self.url = url
        self.host, self.path = urllib.parse.urlparse(url)[1:3]

    def read(self):
        """Reads the robots.txt URL and feeds it to the parser."""
        try:
            f = urllib.request.urlopen(self.url)
        except urllib.error.HTTPError as err:
            if err.code in (401, 403):
                self.disallow_all = True
            elif err.code >= 400 and err.code < 500:
                self.allow_all = True
        else:
            raw = f.read()
            self.parse(raw.decode("utf-8").splitlines())
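The docstrings above describe a class that reads, parses, and answers questions about a single robots.txt file. A minimal usage sketch follows, assuming the standard-library `urllib.robotparser` module that this bytecode file was compiled from. The URL, the crawler name `ExampleBot`, and the two rule lines are hypothetical, and the rules are passed straight to `parse()` rather than fetched with `read()`, so the sketch needs no network access.

```python
import urllib.robotparser

# Hypothetical site and crawler name, used only for illustration.
ROBOTS_URL = "https://example.com/robots.txt"
USER_AGENT = "ExampleBot"

parser = urllib.robotparser.RobotFileParser()
parser.set_url(ROBOTS_URL)        # splits the URL into parser.host and parser.path

fetched_before = parser.mtime()   # still 0: nothing has been fetched or parsed yet

# Feed hypothetical rules directly instead of calling read(),
# which would try to download ROBOTS_URL.
parser.parse([
    "User-agent: *",
    "Disallow: /private/",
])
fetched_after = parser.mtime()    # parse() records the current time

blocked = parser.can_fetch(USER_AGENT, "https://example.com/private/page")
allowed = parser.can_fetch(USER_AGENT, "https://example.com/public/page")

print(parser.host, parser.path)
print(blocked, allowed)
```

In a real crawler, `read()` would replace the `parse()` call. As the `read()` docstring and its error handling suggest, an HTTP error on the robots.txt request sets `disallow_all` or `allow_all` instead of parsing any rules.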