easylist-pac-privoxy's People

Contributors

essandess, maxcountryman

easylist-pac-privoxy's Issues

many warnings?

This is the output after running the file through Python:

Python 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:37:02) [MSC v.1924 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license()" for more information.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

__author__ = 'stsmith'

# easylist_pac: Convert EasyList Tracker and Adblocking rules to an efficient Proxy Auto Configuration file

# Copyright (C) 2017-2020 by Steven T. Smith , GPL

# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see http://www.gnu.org/licenses/.

import argparse as ap, copy, datetime, functools as fnt, numpy as np, os, re, sys, time, urllib.request, warnings

try:
    machine_learning_flag = True
    import multiprocessing as mp, scipy.sparse as sps
    from sklearn.linear_model import LogisticRegression
    from sklearn.preprocessing import StandardScaler
except ImportError as e:
    machine_learning_flag = False
    print(e)
    warnings.warn("Install scikit-learn for more accurate EasyList rule selection.")

try:
    plot_flag = True
    import matplotlib as mpl, matplotlib.pyplot as plt
    # Legible plot style defaults
    # http://matplotlib.org/api/matplotlib_configuration_api.html
    # http://matplotlib.org/users/customizing.html
    mpl.rcParams['figure.figsize'] = (10.0, 5.0)
    mpl.rc('font', **{'family': 'sans-serif', 'weight': 'bold', 'size': 14})
    mpl.rc('axes', **{'titlesize': 20, 'titleweight': 'bold', 'labelsize': 16, 'labelweight': 'bold'})
    mpl.rc('legend', **{'fontsize': 14})
    mpl.rc('figure', **{'titlesize': 16, 'titleweight': 'bold'})
    mpl.rc('lines', **{'linewidth': 2.5, 'markersize': 18, 'markeredgewidth': 0})
    mpl.rc('mathtext',
           **{'fontset': 'custom', 'rm': 'sans:bold', 'bf': 'sans:bold', 'it': 'sans:italic', 'sf': 'sans:bold',
              'default': 'it'})
    # plt.rc('text',usetex=False) # [default] usetex should be False
    mpl.rcParams['text.latex.preamble'] = [r'\usepackage{amsmath,sfmath} \boldmath']
except ImportError as e:
    plot_flag = False
    print(e)
    warnings.warn("Install matplotlib to plot rule priorities.")

class EasyListPAC:
'''Create a Proxy Auto Configuration file from EasyList rule sets.'''

def __init__(self):
    self.parseArgs()
    self.easylists_download_latest()
    self.parse_and_filter_rule_files()
    self.prioritize_rules()
    if not self.my_extra_rules_off:
        self.easylist_append_rules(my_extra_rules)
    if self.debug:
        print("Good rules and strengths:\n" + '\n'.join('{: 5d}:\t{}\t\t[{:2.1f}]'.format(i,r,s) for (i,(r,s)) in enumerate(zip(self.good_rules,self.good_signal))))
        print("\nBad rules and strengths:\n" + '\n'.join('{: 5d}:\t{}\t\t[{:2.1f}]'.format(i,r,s) for (i,(r,s)) in enumerate(zip(self.bad_rules,self.bad_signal))))
        if plot_flag:
            # plt.plot(np.arange(len(self.good_signal)), self.good_signal, '.')
            # plt.show()
            plt.plot(np.arange(len(self.bad_signal)), self.bad_signal, '.')
            plt.xlabel('Rule index')
            plt.ylabel('Bad rule distance (logit)')
            plt.show()
        return
    self.parse_easylist_rules()
    self.create_pac_file()

def parseArgs(self):
    # blackhole specification in arguments
    # best choice is the LAN IP address of the http://hostname/proxy.pac web server or a dedicated blackhole server, e.g. 192.168.0.2:8119
    parser = ap.ArgumentParser()
    parser.add_argument('-b', '--blackhole', help="Blackhole IP:port", type=str, default='127.0.0.1:8119')
    parser.add_argument('-d', '--download-dir', help="Download directory", type=str, default='~/Downloads')
    parser.add_argument('-g', '--debug', help="Debug: Just print rules", action='store_true')
    parser.add_argument('-moff', '--my_extra_rules_turnoff_flag', help="Turn off adding my extra rules", default=False, action='store_true')
    parser.add_argument('-p', '--proxy', help="Proxy host:port", type=str, default='')
    parser.add_argument('-P', '--PAC-original', help="Original proxy.pac file", type=str, default='proxy.pac.orig')
    parser.add_argument('-rb', '--bad-rule-max', help="Maximum number of bad rules (-1 for unlimited)", type=int,
                        default=19999)
    parser.add_argument('-rg', '--good-rule-max', help="Maximum number of good rules (-1 for unlimited)",
                        type=int, default=1099)
    parser.add_argument('-th', '--truncate_hash', help="Truncate hash object length to maximum number", type=int,
                        default=3999)
    parser.add_argument('-tr', '--truncate_regex', help="Truncate regex rules to maximum number", type=int,
                        default=499)
    parser.add_argument('-w', '--sliding-window', help="Sliding window training and test (slow)", action='store_true')
    parser.add_argument('-x', '--Extra_EasyList_URLs', help="Extra EasyList URLs", type=str, nargs='+', default=[])
    parser.add_argument('-*', '--wildcard-limit', help="Limit the number of wildcards", type=int, default=999)
    parser.add_argument('-@@', '--exceptions_include_flag', help="Include exception rules", action='store_true')
    # parse the command line once and keep a reference on the instance
    args = self.args = parser.parse_args()
    self.blackhole_ip_port = args.blackhole
    self.easylist_dir = os.path.expanduser(args.download_dir)
    self.debug = args.debug
    self.my_extra_rules_off = args.my_extra_rules_turnoff_flag
    self.proxy_host_port = args.proxy
    self.orig_pac_file = os.path.join(self.easylist_dir, args.PAC_original)
    # n.b. negative limits are set to no limits using [:None] slicing trick
    self.good_rule_max = args.good_rule_max if args.good_rule_max >= 0 else None
    self.bad_rule_max = args.bad_rule_max if args.bad_rule_max >= 0 else None
    self.truncate_hash_max = args.truncate_hash if args.truncate_hash >= 0 else None
    self.truncate_alternatives_max = args.truncate_regex if args.truncate_regex >= 0 else None
    self.sliding_window = args.sliding_window
    self.exceptions_include_flag = args.exceptions_include_flag
    self.wildcard_named_group_limit = args.wildcard_limit if args.wildcard_limit >= 0 else None
    self.extra_easylist_urls = args.Extra_EasyList_URLs
    return self.args

def easylists_download_latest(self):
    easylist_url = 'https://easylist.to/easylist/easylist.txt'
    easyprivacy_url = 'https://easylist.to/easylist/easyprivacy.txt'
    fanboy_annoyance_url = 'https://easylist.to/easylist/fanboy-annoyance.txt'
    fanboy_antifacebook = 'https://raw.githubusercontent.com/ryanbr/fanboy-adblock/master/fanboy-antifacebook.txt'
    self.download_list = [fanboy_antifacebook, fanboy_annoyance_url, easyprivacy_url, easylist_url] + self.extra_easylist_urls
    self.file_list = []
    for url in self.download_list:
        fname = os.path.basename(url)
        fname_full = os.path.join(self.easylist_dir, fname)
        file_utc = file_to_utc(fname_full) if os.path.isfile(os.path.join(self.easylist_dir, fname)) else 0.
        resp = urllib.request.urlopen(urllib.request.Request(url, headers={'User-Agent': user_agent}))
        url_utc = last_modified_to_utc(last_modified_resp(resp))
        if (url_utc > file_utc) or (os.path.getsize(fname_full) == 0):  # download the newer file
            with open(fname_full, mode='w', encoding='utf-8') as out_file:
                out_file.write(resp.read().decode('utf-8'))
        self.file_list.append(fname_full)

def parse_and_filter_rule_files(self):
    """Parse all rules into good and bad lists. Use flags to specify included/excluded rules."""
    self.good_rules = []
    self.bad_rules = []
    self.good_opts = []
    self.bad_opts = []
    self.good_rules_include_flag = []
    self.bad_rules_include_flag = []
    for file in self.file_list:
        with open(file, 'r', encoding='utf-8') as fd:
            self.easylist_append_rules(fd)

def easylist_append_rules(self, fd):
    """Append EasyList rules from file to good and bad lists."""
    for line in fd:
        line = line.rstrip()
        try:
            self.easylist_append_one_rule(line)
        except self.RuleIgnored as e:
            if self.debug: print(e,flush=True)
            continue

class RuleIgnored(Exception):
    pass

def easylist_append_one_rule(self, line):
    """Append EasyList rules from line to good and bad lists."""
    ignore_rules_flag = False
    ignored_rules_count = 0
    line_orig = line
    # configuration lines and selector rules should already be filtered out
    if re_test(configuration_re, line) or re_test(selector_re, line): raise self.RuleIgnored("Rule '{}' not added.".format(line))
    exception_flag = exception_filter(line)  # block default; pass if True
    line = exception_re.sub(r'\1', line)
    option_exception_re = not3dimppuposgh_option_exception_re  # ignore these options by default
    # delete all easylist options **prior** to regex and selector cases
    # ignore domain limits for now
    opts = ''  # default: no options in the rule
    if re_test(option_re, line):
        opts = option_re.sub(r'\2', line)
        # domain-specific and other option exceptions: ignore
        # too many rules (>~ 10k) bog down the browser; make reasonable exclusions here
        line = option_re.sub(r'\1', line)  # delete all the options and continue
    # ignore these cases
    # comment case: ignore
    if re_test(comment_re, line):
        if re_test(commentname_sections_ignore_re, line):
            ignored_rules_comment_start = comment_re.sub('', line)
            if not ignore_rules_flag:
                ignored_rules_count = 0
                ignore_rules_flag = True
                print('Ignore rules following comment ', end='', flush=True)
            print('"{}"… '.format(ignored_rules_comment_start), end='', flush=True)
        else:
            if ignore_rules_flag: print('\n {:d} rules ignored.'.format(ignored_rules_count), flush=True)
            ignored_rules_count = 0
            ignore_rules_flag = False
        raise self.RuleIgnored("Rule '{}' not added.".format(line))
    if ignore_rules_flag:
        ignored_rules_count += 1
        self.append_rule(exception_flag, line, opts, False)
        raise self.RuleIgnored("Rule '{}' not added.".format(line))
    # blank url case: ignore
    if re_test(httpempty_re, line): raise self.RuleIgnored("Rule '{}' not added.".format(line))
    # blank line case: ignore
    if not bool(line): raise self.RuleIgnored("Rule '{}' not added.".format(line))
    # block default or pass exception
    if exception_flag:
        option_exception_re = not3dimppuposgh_option_exception_re  # ignore these options within exceptions
        if not self.exceptions_include_flag:
            self.append_rule(exception_flag, line, opts, False)
            raise self.RuleIgnored("Rule '{}' not added.".format(line))
    # specific options: ignore
    if re_test(option_exception_re, opts):
        self.append_rule(exception_flag, line, opts, False)
        raise self.RuleIgnored("Rule '{}' not added.".format(line))
    # add all remaining rules
    self.append_rule(exception_flag, line, opts, True)

def append_rule(self,exception_flag,rule, opts, include_rule_flag):
    if not bool(rule): return  # last chance to reject blank lines -- shouldn't happen
    if exception_flag:
        self.good_rules.append(rule)
        self.good_opts.append(option_tokenizer(opts))
        self.good_rules_include_flag.append(include_rule_flag)
    else:
        self.bad_rules.append(rule)
        self.bad_opts.append(option_tokenizer(opts))
        self.bad_rules_include_flag.append(include_rule_flag)

def good_class_test(self,rule,opts=''):
    return not bool(badregex_regex_filters_re.search(rule))

def bad_class_test(self,rule,opts=''):
    """Bad rule of interest if a match for the bad regex's or specific rule options,
    e.g. non-domain specific popups or images."""
    return bool(badregex_regex_filters_re.search(rule)) \
        or (bool(opts) and bool(thrdp_im_pup_os_option_re.search(opts))
            and not bool(not3dimppupos_option_exception_re.search(opts)))

def prioritize_rules(self):
    # use bootstrap regex preferences
    # https://github.com/seatgeek/fuzzywuzzy would be great here if there were such a thing for regex
    # n.b. the np.int alias was deprecated in NumPy 1.20 and removed in 1.24; the builtin int is equivalent
    self.good_signal = np.array([self.good_class_test(x,opts) for (x,opts,f) in zip(self.good_rules,self.good_opts,self.good_rules_include_flag) if f], dtype=int)
    self.bad_signal = np.array([self.bad_class_test(x,opts) for (x,opts,f) in zip(self.bad_rules,self.bad_opts,self.bad_rules_include_flag) if f], dtype=int)

    self.good_columns = np.array([i for (i,f) in enumerate(self.good_rules_include_flag) if f],dtype=int)
    self.bad_columns = np.array([i for (i,f) in enumerate(self.bad_rules_include_flag) if f],dtype=int)

    # Logistic Regression for more accurate rule priorities
    if machine_learning_flag:
        print("Performing logistic regression on rule sets. This will take a few minutes…",end='',flush=True)
        self.logreg_priorities()
        print(" done.", flush=True)

        # truncate to positive signal strengths
        if not self.debug:
            self.good_rule_max = min(self.good_rule_max,np.count_nonzero(self.good_signal > 0)) \
                if isinstance(self.good_rule_max,(int,np.integer)) else np.count_nonzero(self.good_signal > 0)
            self.bad_rule_max = min(self.bad_rule_max, np.count_nonzero(self.bad_signal > 0)) \
                if isinstance(self.bad_rule_max,(int,np.integer)) else np.count_nonzero(self.bad_signal > 0)

    # prioritize and limit the rules
    good_pridx = np.array([e[0] for e in sorted(enumerate(self.good_signal),key=lambda e: e[1],reverse=True)],dtype=int)[:self.good_rule_max]
    self.good_columns = self.good_columns[good_pridx]
    self.good_signal = self.good_signal[good_pridx]
    self.good_rules = [self.good_rules[k] for k in self.good_columns]
    bad_pridx = np.array([e[0] for e in sorted(enumerate(self.bad_signal),key=lambda e: e[1],reverse=True)],dtype=int)[:self.bad_rule_max]
    self.bad_columns = self.bad_columns[bad_pridx]
    self.bad_signal = self.bad_signal[bad_pridx]
    self.bad_rules = [self.bad_rules[k] for k in self.bad_columns]

    # include hardcoded rules
    for rule in include_these_good_rules:
        if rule not in self.good_rules: self.good_rules.append(rule)
    for rule in include_these_bad_rules:
        if rule not in self.bad_rules: self.bad_rules.append(rule)

    # rules are now ordered
    self.good_columns = np.arange(0,len(self.good_rules),dtype=self.good_columns.dtype)
    self.bad_columns = np.arange(0,len(self.bad_rules),dtype=self.bad_columns.dtype)

    return

def logreg_priorities(self):
    """Rule prioritization using logistic regression on bootstrap preferences."""
    self.good_fv_json = {}
    self.good_column_hash = {}
    for col, (rule,opts) in enumerate(zip(self.good_rules,self.good_opts)):
        feature_vector_append_column(rule, opts, col, self.good_fv_json)
        self.good_column_hash[rule] = col
    self.bad_fv_json = {}
    self.bad_column_hash = {}
    for col, (rule,opts) in enumerate(zip(self.bad_rules,self.bad_opts)):
        feature_vector_append_column(rule, opts, col, self.bad_fv_json)
        self.bad_column_hash[rule] = col

    self.good_fv_mat, self.good_row_hash = fv_to_mat(self.good_fv_json, self.good_rules)
    self.bad_fv_mat, self.bad_row_hash = fv_to_mat(self.bad_fv_json, self.bad_rules)

    # builtin float/int replace the np.float/np.int aliases (deprecated in NumPy 1.20, removed in 1.24)
    self.good_X_all = StandardScaler(with_mean=False).fit_transform(self.good_fv_mat.astype(float))
    self.good_y_all = np.array([self.good_class_test(x,opts) for (x,opts) in zip(self.good_rules, self.good_opts)], dtype=int)

    self.bad_X_all = StandardScaler(with_mean=False).fit_transform(self.bad_fv_mat.astype(float))
    self.bad_y_all = np.array([self.bad_class_test(x,opts) for (x,opts) in zip(self.bad_rules, self.bad_opts)], dtype=int)

    self.logit_fit_method_sample_weights()

    # inverse regularization signal; smaller values give more sparseness, less model rigidity
    self.C = 1.e1

    self.logreg_test_in_training()
    if self.sliding_window: self.logreg_sliding_window()

    return

def debug_feature_vector(self,rule_substring=r'google.com/pagead'):
    for j, rule in enumerate(self.bad_rules):
        if rule.find(rule_substring) >= 0: break
    col = j
    print(self.bad_rules[col])
    _, rows = self.bad_fv_mat[col,:].nonzero()  # fv_mat is transposed
    print(rows)
    for row in rows:
        print('Row {:d}: {}:: {:g}'.format(row, self.bad_row_hash[int(row)], self.bad_fv_mat[col, row]))

def logit_fit_method_sample_weights(self):
    # weights for LogisticRegression.fit()
    self.good_w_all = np.ones(len(self.good_y_all))
    self.bad_w_all = np.ones(len(self.bad_y_all))

    # add more weight for each of these regex matches
    for i, rule in enumerate(self.bad_rules):
        self.bad_w_all[i] += 1/max(1,len(rule))  # slight disadvantage for longer rules
        for regex in high_weight_regex:
            self.bad_w_all[i] += len(regex.findall(rule))
        # these options have more weight
        self.bad_w_all[i] += bool(thrdp_im_pup_os_option_re.search(self.bad_opts[i]))
    return

def logreg_test_in_training(self):
    """fast, initial method: test vectors in the training data"""

    self.good_fv_logreg = LogisticRegression(C=self.C, penalty='l2', solver='liblinear', tol=0.01)
    self.bad_fv_logreg = LogisticRegression(C=self.C, penalty='l2', solver='liblinear', tol=0.01)

    good_x_test = self.good_X_all[self.good_columns]
    good_X = self.good_X_all
    good_y = self.good_y_all
    good_w = self.good_w_all

    bad_x_test = self.bad_X_all[self.bad_columns]
    bad_X = self.bad_X_all
    bad_y = self.bad_y_all
    bad_w = self.bad_w_all

    if good_x_test.shape[0] > 0:
        self.good_fv_logreg.fit(good_X, good_y, sample_weight=good_w)
        self.good_signal = self.good_fv_logreg.decision_function(good_x_test)
    if bad_x_test.shape[0] > 0:
        self.bad_fv_logreg.fit(bad_X, bad_y, sample_weight=bad_w)
        self.bad_signal = self.bad_fv_logreg.decision_function(bad_x_test)
    return

def logreg_sliding_window(self):
    """bootstrap the signal strengths by removing test vectors from training"""

    # pre-prioritize using test-in-target values and limit the rules
    if not self.debug:
        good_preidx = np.array([e[0] for e in sorted(enumerate(self.good_signal),key=lambda e: e[1],reverse=True)],dtype=int)[:int(np.ceil(1.4*self.good_rule_max))]
        self.good_columns = self.good_columns[good_preidx]
        bad_preidx = np.array([e[0] for e in sorted(enumerate(self.bad_signal),key=lambda e: e[1],reverse=True)],dtype=int)[:int(np.ceil(1.4*self.bad_rule_max))]
        self.bad_columns = self.bad_columns[bad_preidx]

    # multithreaded loop for speed
    use_blocked_not_sklearn_mp = True  # it's a lot faster to block it yourself
    if use_blocked_not_sklearn_mp:
        # init w/ target-in-training results
        good_fv_logreg = copy.deepcopy(self.good_fv_logreg)
        good_fv_logreg.penalty = 'l2'
        good_fv_logreg.solver = 'sag'
        good_fv_logreg.warm_start = True
        good_fv_logreg.n_jobs = 1  # achieve parallelism via block processing
        bad_fv_logreg = copy.deepcopy(self.bad_fv_logreg)
        bad_fv_logreg.penalty = 'l2'
        bad_fv_logreg.solver = 'sag'
        bad_fv_logreg.warm_start = True
        bad_fv_logreg.n_jobs = 1  # achieve parallelism via block processing
        if False:  # debug mp: turn off multiprocessing with a monkeypatch
            class NotAMultiProcess(mp.Process):
                def start(self): self.run()
                def join(self): pass
            mp.Process = NotAMultiProcess

        # this is probably efficient with Linux's copy-on-write fork(); unsure about BSD/macOS
        # must refactor to use shared Array() [along with warm_start coeff's] to ensure
        # see https://stackoverflow.com/questions/5549190/is-shared-readonly-data-copied-to-different-processes-for-python-multiprocessing/

        # distribute training and tests across multiprocessors
        def training_op(queue, X_all, y_all, w_all, fv_logreg, columns, column_block):
            """Training and test operation put into a mp.Queue.
            columns[column_block] and signal[column_block] are the rule columns and corresponding signal strengths
            """
            res = np.zeros(len(column_block))
            for k in range(len(column_block)):
                mask = np.zeros(len(y_all), dtype=bool)
                mask[columns[column_block[k]]] = True
                mask = np.logical_not(mask)

                x_test = X_all[np.logical_not(mask)]
                X = X_all[mask]
                y = y_all[mask]
                w = w_all[mask]

                fv_logreg.fit(X, y, sample_weight=w)
                res[k] = fv_logreg.decision_function(x_test)[0]
            queue.put((column_block,res))  # signal[column_block] = res
            return

        num_threads = mp.cpu_count()

        # good
        q = mp.Queue()
        jobs = []
        self.good_signal = np.zeros(len(self.good_columns))
        block_length = len(self.good_columns) // num_threads
        column_block = np.arange(0, block_length)
        while len(column_block) > 0:
            column_block = column_block[np.where(column_block < len(self.good_columns))]
            fv_logreg = copy.deepcopy(good_fv_logreg)  # each process gets its own .coeff_'s
            column_block_copy = np.copy(column_block)  # each process gets its own block of columns
            p = mp.Process(target=training_op, args=(q, self.good_X_all, self.good_y_all, self.good_w_all, fv_logreg, self.good_columns, column_block_copy))
            p.start()
            jobs.append(p)
            column_block += len(column_block)
        # process the results in the queue
        for i in range(len(jobs)):
            column_block, res = q.get()
            self.good_signal[column_block] = res
        # join all jobs and wait for them to complete
        for p in jobs: p.join()

        # bad
        q = mp.Queue()
        jobs = []
        self.bad_signal = np.zeros(len(self.bad_columns))
        block_length = len(self.bad_columns) // num_threads
        column_block = np.arange(0, block_length)
        while len(column_block) > 0:
            column_block = column_block[np.where(column_block < len(self.bad_columns))]
            fv_logreg = copy.deepcopy(bad_fv_logreg)   # each process gets its own .coeff_'s
            column_block_copy = np.copy(column_block)  # each process gets its own block of columns
            p = mp.Process(target=training_op, args=(q, self.bad_X_all, self.bad_y_all, self.bad_w_all, fv_logreg, self.bad_columns, column_block_copy))
            p.start()
            jobs.append(p)
            column_block += len(column_block)
        # process the results in the queue
        for i in range(len(jobs)):
            column_block, res = q.get()
            self.bad_signal[column_block] = res
        # join all jobs and wait for them to complete
        for p in jobs: p.join()
    else:  # if use_blocked_not_sklearn_mp:
        def training_op(X_all, y_all, w_all, fv_logreg, columns, signal):
            """Training and test operations reusing results with multiprocessing."""
            res = np.zeros(len(signal))
            for k in range(len(res)):
                mask = np.zeros(len(y_all), dtype=bool)
                mask[columns[k]] = True
                mask = np.logical_not(mask)

                x_test = X_all[np.logical_not(mask)]
                X = X_all[mask]
                y = y_all[mask]
                w = w_all[mask]

                fv_logreg.fit(X, y, sample_weight=w)
                res[k] = fv_logreg.decision_function(x_test)[0]
            signal[:] = res
            return
        # good
        training_op(self.good_X_all, self.good_y_all, self.good_w_all, self.good_fv_logreg, self.good_columns, self.good_signal)
        # bad
        training_op(self.bad_X_all, self.bad_y_all, self.bad_w_all, self.bad_fv_logreg, self.bad_columns, self.bad_signal)
    return

def parse_easylist_rules(self):
    for rule in self.good_rules: self.easylist_to_javascript_vars(rule)
    for rule in self.bad_rules: self.easylist_to_javascript_vars(rule)
    ordered_unique_all_js_var_lists()
    return

def easylist_to_javascript_vars(self,rule,ignore_huge_url_regex_rule_list=False):
    rule = rule.rstrip()
    rule_orig = rule
    exception_flag = exception_filter(rule)  # block default; pass if True
    rule = exception_re.sub(r'\1', rule)
    option_exception_re = not3dimppuposgh_option_exception_re  # ignore these options by default
    opts = ''  # default: no options in the rule
    if re_test(option_re, rule):
        opts = option_re.sub(r'\2', rule)
        # domain-specific and other option exceptions: ignore
        # too many rules (>~ 10k) bog down the browser; make reasonable exclusions here
        rule = option_re.sub(r'\1', rule)  # delete all the options and continue
    # ignore these cases
    # comment case: ignore
    if re_test(comment_re, rule): return
    # block default or pass exception
    if exception_flag:
        option_exception_re = not3dimppuposgh_option_exception_re  # ignore these options within exceptions
        if not self.exceptions_include_flag: return
    # specific options: ignore
    if re_test(option_exception_re, opts): return
    # blank url case: ignore
    if re_test(httpempty_re, rule): return
    # blank line case: ignore
    if not rule: return
    # treat each of the these cases separately, here and in Javascript
    # regex case
    if re_test(regex_re, rule):
        if regex_ignore_test(rule): return
        rule = regex_re.sub(r'\1', rule)
        if exception_flag:
            good_url_regex.append(rule)
        else:
            if not re_test(badregex_regex_filters_re,
                           rule): return  # limit bad regex's to those in the filter
            bad_url_regex.append(rule)
        return
    # now that regex's are handled, delete unnecessary wildcards, e.g. /.../*
    rule = wildcard_begend_re.sub(r'\1', rule)
    # domain anchors, || or '|http://a.b' -> domain anchor 'a.b' for regex efficiency in JS
    if re_test(domain_anch_re, rule) or re_test(scheme_anchor_re, rule):
        # strip off initial || or |scheme://
        if re_test(domain_anch_re, rule):
            rule = domain_anch_re.sub(r'\1', rule)
        elif re_test(scheme_anchor_re, rule):
            rule = scheme_anchor_re.sub("", rule)
        # host subcase
        if re_test(da_hostonly_re, rule):
            rule = da_hostonly_re.sub(r'\1', rule)
            if not re_test(wild_anch_sep_exc_re, rule):  # exact subsubcase
                if not re_test(badregex_regex_filters_re, rule):
                    return  # limit bad regex's to those in the filter
                if exception_flag:
                    good_da_host_exact.append(rule)
                else:
                    bad_da_host_exact.append(rule)
                return
            else:  # regex subsubcase
                if regex_ignore_test(rule): return
                if exception_flag:
                    good_da_host_regex.append(rule)
                else:
                    if not re_test(badregex_regex_filters_re,
                                   rule): return  # limit bad regex's to those in the filter
                    bad_da_host_regex.append(rule)
                return
        # hostpath subcase
        if re_test(da_hostpath_re, rule):
            rule = da_hostpath_re.sub(r'\1', rule)
            if not re_test(wild_sep_exc_noanch_re, rule) and re_test(pathend_re, rule):  # exact subsubcase
                rule = re.sub(r'\|$', '', rule)  # strip EOL anchors
                if not re_test(badregex_regex_filters_re, rule):
                    return  # limit bad regex's to those in the filter
                if exception_flag:
                    good_da_hostpath_exact.append(rule)
                else:
                    bad_da_hostpath_exact.append(rule)
                return
            else:  # regex subsubcase
                if regex_ignore_test(rule): return
                # ignore option rules for some regex rules
                if re_test(alloption_exception_re, opts): return
                if exception_flag:
                    good_da_hostpath_regex.append(rule)
                else:
                    if not re_test(badregex_regex_filters_re,
                                   rule): return  # limit bad regex's to those in the filter
                    bad_da_hostpath_regex.append(rule)
                return
        # hostpathquery default case
        if True:
            # if re_test(re.compile(r'^go\.'),rule):
            #     pass
            if regex_ignore_test(rule): return
            if exception_flag:
                good_da_regex.append(rule)
            else:
                bad_da_regex.append(rule)
            return
    # all other non-regex patterns
    if True:
        if regex_ignore_test(rule): return
        if not ignore_huge_url_regex_rule_list:
            if re_test(alloption_exception_re, opts): return
            if exception_flag:
                good_url_parts.append(rule)
            else:
                if not re_test(badregex_regex_filters_re,
                               rule): return  # limit bad regex's to those in the filter
                bad_url_parts.append(rule)
            return  # superfluous return

def create_pac_file(self):
    self.proxy_pac_init()
    self.proxy_pac = self.proxy_pac_preamble \
                + "\n".join(["// " + l for l in self.easylist_strategy.split("\n")]) \
                + self.js_init_object('good_da_host_exact') \
                + self.js_init_regexp('good_da_host_regex', True) \
                + self.js_init_object('good_da_hostpath_exact') \
                + self.js_init_regexp('good_da_hostpath_regex', True) \
                + self.js_init_regexp('good_da_regex', True) \
                + self.js_init_object('good_da_host_exceptions_exact') \
                + self.js_init_object('bad_da_host_exact') \
                + self.js_init_regexp('bad_da_host_regex', True) \
                + self.js_init_object('bad_da_hostpath_exact') \
                + self.js_init_regexp('bad_da_hostpath_regex', True) \
                + self.js_init_regexp('bad_da_regex', True) \
                + self.js_init_regexp('good_url_parts') \
                + self.js_init_regexp('bad_url_parts') \
                + self.js_init_regexp('good_url_regex', regex_flag=True) \
                + self.js_init_regexp('bad_url_regex', regex_flag=True) \
                + self.proxy_pac_postamble

    for l in ['good_da_host_exact',
              'good_da_host_regex',
              'good_da_hostpath_exact',
              'good_da_hostpath_regex',
              'good_da_regex',
              'good_da_host_exceptions_exact',
              'bad_da_host_exact',
              'bad_da_host_regex',
              'bad_da_hostpath_exact',
              'bad_da_hostpath_regex',
              'bad_da_regex',
              'good_url_parts',
              'bad_url_parts',
              'good_url_regex',
              'bad_url_regex']:
        print("{}: {:d} rules".format(l, len(globals()[l])), flush=True)

    with open(os.path.join(self.easylist_dir, 'proxy.pac'), 'w', encoding='utf-8') as fd:
        fd.write(self.proxy_pac)

def proxy_pac_init(self):
    self.pac_proxy = 'PROXY {}'.format(self.proxy_host_port) if self.proxy_host_port else 'DIRECT'

    # define a default, user-supplied FindProxyForURL function
    self.default_FindProxyForURL_function = '''\

function FindProxyForURL(url, host)
{
if (
isPlainHostName(host) ||
shExpMatch(host, "10.*") ||
shExpMatch(host, "172.16.*") ||
shExpMatch(host, "192.168.*") ||
shExpMatch(host, "127.*") ||
dnsDomainIs(host, ".local") || dnsDomainIs(host, ".LOCAL")
)
return "DIRECT";
else if (
/*
Proxy bypass hostnames
*/
/*
Fix iOS 13 PAC file issue with Mail.app
See: https://forums.developer.apple.com/thread/121928
*/
// Apple
(host == "imap.mail.me.com") || (host == "smtp.mail.me.com") ||
dnsDomainIs(host, "imap.mail.me.com") || dnsDomainIs(host, "smtp.mail.me.com") ||
(host == "p03-imap.mail.me.com") || (host == "p03-smtp.mail.me.com") ||
dnsDomainIs(host, "p03-imap.mail.me.com") || dnsDomainIs(host, "p03-smtp.mail.me.com") ||
(host == "p66-imap.mail.me.com") || (host == "p66-smtp.mail.me.com") ||
dnsDomainIs(host, "p66-imap.mail.me.com") || dnsDomainIs(host, "p66-smtp.mail.me.com") ||
// Google
(host == "imap.gmail.com") || (host == "smtp.gmail.com") ||
dnsDomainIs(host, "imap.gmail.com") || dnsDomainIs(host, "smtp.gmail.com") ||
// Yahoo
(host == "imap.mail.yahoo.com") || (host == "smtp.mail.yahoo.com") ||
dnsDomainIs(host, "imap.mail.yahoo.com") || dnsDomainIs(host, "smtp.mail.yahoo.com") ||
// Comcast
(host == "imap.comcast.net") || (host == "smtp.comcast.net") ||
dnsDomainIs(host, "imap.comcast.net") || dnsDomainIs(host, "smtp.comcast.net") ||
// Apple Enterprise Network Domains; https://support.apple.com/en-us/HT210060
(host == "albert.apple.com") || dnsDomainIs(host, "albert.apple.com") ||
(host == "captive.apple.com") || dnsDomainIs(host, "captive.apple.com") ||
(host == "gs.apple.com") || dnsDomainIs(host, "gs.apple.com") ||
(host == "humb.apple.com") || dnsDomainIs(host, "humb.apple.com") ||
(host == "static.ips.apple.com") || dnsDomainIs(host, "static.ips.apple.com") ||
(host == "tbsc.apple.com") || dnsDomainIs(host, "tbsc.apple.com") ||
(host == "time-ios.apple.com") || dnsDomainIs(host, "time-ios.apple.com") ||
(host == "time.apple.com") || dnsDomainIs(host, "time.apple.com") ||
(host == "time-macos.apple.com") || dnsDomainIs(host, "time-macos.apple.com") ||
dnsDomainIs(host, ".push.apple.com") ||
(host == "gdmf.apple.com") || dnsDomainIs(host, "gdmf.apple.com") ||
(host == "deviceenrollment.apple.com") || dnsDomainIs(host, "deviceenrollment.apple.com") ||
(host == "deviceservices-external.apple.com") || dnsDomainIs(host, "deviceservices-external.apple.com") ||
(host == "identity.apple.com") || dnsDomainIs(host, "identity.apple.com") ||
(host == "iprofiles.apple.com") || dnsDomainIs(host, "iprofiles.apple.com") ||
(host == "mdmenrollment.apple.com") || dnsDomainIs(host, "mdmenrollment.apple.com") ||
(host == "setup.icloud.com") || dnsDomainIs(host, "setup.icloud.com") ||
(host == "appldnld.apple.com") || dnsDomainIs(host, "appldnld.apple.com") ||
(host == "gg.apple.com") || dnsDomainIs(host, "gg.apple.com") ||
(host == "gnf-mdn.apple.com") || dnsDomainIs(host, "gnf-mdn.apple.com") ||
(host == "gnf-mr.apple.com") || dnsDomainIs(host, "gnf-mr.apple.com") ||
(host == "gs.apple.com") || dnsDomainIs(host, "gs.apple.com") ||
(host == "ig.apple.com") || dnsDomainIs(host, "ig.apple.com") ||
(host == "mesu.apple.com") || dnsDomainIs(host, "mesu.apple.com") ||
(host == "oscdn.apple.com") || dnsDomainIs(host, "oscdn.apple.com") ||
(host == "osrecovery.apple.com") || dnsDomainIs(host, "osrecovery.apple.com") ||
(host == "skl.apple.com") || dnsDomainIs(host, "skl.apple.com") ||
(host == "swcdn.apple.com") || dnsDomainIs(host, "swcdn.apple.com") ||
(host == "swdist.apple.com") || dnsDomainIs(host, "swdist.apple.com") ||
(host == "swdownload.apple.com") || dnsDomainIs(host, "swdownload.apple.com") ||
(host == "swpost.apple.com") || dnsDomainIs(host, "swpost.apple.com") ||
(host == "swscan.apple.com") || dnsDomainIs(host, "swscan.apple.com") ||
(host == "updates-http.cdn-apple.com") || dnsDomainIs(host, "updates-http.cdn-apple.com") ||
(host == "updates.cdn-apple.com") || dnsDomainIs(host, "updates.cdn-apple.com") ||
(host == "xp.apple.com") || dnsDomainIs(host, "xp.apple.com") ||
dnsDomainIs(host, ".itunes.apple.com") ||
dnsDomainIs(host, ".apps.apple.com") ||
dnsDomainIs(host, ".mzstatic.com") ||
(host == "ppq.apple.com") || dnsDomainIs(host, "ppq.apple.com") ||
(host == "lcdn-registration.apple.com") || dnsDomainIs(host, "lcdn-registration.apple.com") ||
(host == "crl.apple.com") || dnsDomainIs(host, "crl.apple.com") ||
(host == "crl.entrust.net") || dnsDomainIs(host, "crl.entrust.net") ||
(host == "crl3.digicert.com") || dnsDomainIs(host, "crl3.digicert.com") ||
(host == "crl4.digicert.com") || dnsDomainIs(host, "crl4.digicert.com") ||
(host == "ocsp.apple.com") || dnsDomainIs(host, "ocsp.apple.com") ||
(host == "ocsp.digicert.com") || dnsDomainIs(host, "ocsp.digicert.com") ||
(host == "ocsp.entrust.net") || dnsDomainIs(host, "ocsp.entrust.net") ||
(host == "ocsp.verisign.net") || dnsDomainIs(host, "ocsp.verisign.net") ||
// Zoom
dnsDomainIs(host, ".zoom.us")
)
return "PROXY localhost:3128";
else
return "PROXY localhost:3128";
}
'''

    if os.path.isfile(self.orig_pac_file):
        with open(self.orig_pac_file, 'r', encoding='utf-8') as fd:
            self.original_FindProxyForURL_function = fd.read()
    else:
        self.original_FindProxyForURL_function = self.default_FindProxyForURL_function

    # change last 'return "PROXY ..."' to 'return EasyListFindProxyForURL(url, host)'
    def re_sub_last(pattern, repl, string, **kwargs): 
        '''re.sub on the last match in a string'''
        # ensure that pattern is grouped
        # (note that (?:) is not caught)
        pattern_grouped = pattern if bool(re.match(r'\(.+\)',pattern)) else r'({})'.format(pattern)
        spl = re.split(pattern_grouped, string, **kwargs) 
        if len(spl) == 1: return string 
        spl[-2] = re.sub(pattern, repl, spl[-2], **kwargs)
        return ''.join(spl)
    self.original_FindProxyForURL_function = re_sub_last(r'return[\s]+"PROXY[^"]+"', 'return EasyListFindProxyForURL(url, host)',
                                           self.original_FindProxyForURL_function)

    #  proxy.pac preamble
    self.calling_command = ' '.join([os.path.basename(sys.argv[0])] + sys.argv[1:])
    self.proxy_pac_preamble = '''\

// PAC (Proxy Auto Configuration) Filter from EasyList rules
//
// Copyright (C) 2017 by Steven T. Smith , GPL
// https://github.com/essandess/easylist-pac-privoxy/
//
// PAC file created on {}
// Created with command: {}
//
// http://www.gnu.org/licenses/lgpl.txt
//
// This program is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with this program. If not, see http://www.gnu.org/licenses/.

// If you normally use a proxy, replace "DIRECT" below with
// "PROXY MACHINE:PORT"
// where MACHINE is the IP address or host name of your proxy
// server and PORT is the port number of your proxy server.
//
// Influenced in part by code from King of the PAC from http://securemecca.com/pac.html

// Define the blackhole proxy for blocked adware and trackware

var normal = "DIRECT";
var proxy = "{}"; // e.g. 127.0.0.1:3128
// var blackhole_ip_port = "127.0.0.1:8119"; // nginx-hosted blackhole
// var blackhole_ip_port = "8.8.8.8:53"; // GOOG DNS blackhole; do not use: no longer works with iOS 11—causes long waits on some sites
var blackhole_ip_port = "{}"; // on iOS a working blackhole requires return code 200;
// e.g. use the adblock2privoxy nginx server as a blackhole
var blackhole = "PROXY " + blackhole_ip_port;

// The hostnames must be consistent with EasyList format.
// These special RegExp characters will be escaped below: [.?+@]
// This EasyList wildcard will be transformed to an efficient RegExp: *
//
// EasyList format references:
// https://adblockplus.org/filters
// https://adblockplus.org/filter-cheatsheet

// Create object hashes or compile efficient NFA's from all filters
// Various alternate filtering and regex approaches were timed using node and at jsperf.com

// Too many rules (>~ 10k) bog down the browser; make reasonable exclusions here:

'''.format(time.strftime("%a, %d %b %Y %X GMT", time.gmtime()),self.calling_command,self.pac_proxy,self.blackhole_ip_port)

    self.proxy_pac_postamble = '''

// Add any good networks here. Format is network followed by a comma and
// optional white space, and then the netmask.
// LAN, loopback, Apple (direct and Akamai e.g. e4805.a.akamaiedge.net), Microsoft (updates and services)
// Apple Enterprise Network; https://support.apple.com/en-us/HT210060
var GoodNetworks_Array = [ "10.0.0.0, 255.0.0.0",
"172.16.0.0, 255.240.0.0",
"17.248.128.0, 255.255.192.0",
"17.250.64.0, 255.255.192.0",
"17.248.192.0, 255.255.224.0",
"192.168.0.0, 255.255.0.0",
"127.0.0.0, 255.0.0.0",
"17.0.0.0, 255.0.0.0",
"23.2.8.68, 255.255.255.255",
"23.2.145.78, 255.255.255.255",
"23.39.179.17, 255.255.255.255",
"23.63.98.0, 255.255.254.0",
"104.70.71.223, 255.255.255.255",
"104.73.77.224, 255.255.255.255",
"104.96.184.235, 255.255.255.255",
"104.96.188.194, 255.255.255.255",
"65.52.0.0, 255.255.252.0" ];

// Apple iAd, Microsoft telemetry
var GoodNetworks_Exceptions_Array = [ "17.172.28.11, 255.255.255.255",
"134.170.30.202, 255.255.255.255",
"137.116.81.24, 255.255.255.255",
"157.56.106.189, 255.255.255.255",
"184.86.53.99, 255.255.255.255",
"2.22.61.43, 255.255.255.255",
"2.22.61.66, 255.255.255.255",
"204.79.197.200, 255.255.255.255",
"23.218.212.69, 255.255.255.255",
"65.39.117.230, 255.255.255.255",
"65.52.108.33, 255.255.255.255",
"65.55.108.23, 255.255.255.255",
"64.4.54.254, 255.255.255.255" ];

// Akamai: 23.64.0.0/14, 23.0.0.0/12, 23.32.0.0/11, 104.64.0.0/10

// Add any bad networks here. Format is network followed by a comma and
// optional white space, and then the netmask.
// From securemecca.com: Adobe marketing cloud, 2o7, omtrdc, Sedo domain parking, flyingcroc, accretive
var BadNetworks_Array = [ "61.139.105.128, 255.255.255.192",
"63.140.35.160, 255.255.255.248",
"63.140.35.168, 255.255.255.252",
"63.140.35.172, 255.255.255.254",
"63.140.35.174, 255.255.255.255",
"66.150.161.32, 255.255.255.224",
"66.235.138.0, 255.255.254.0",
"66.235.141.0, 255.255.255.0",
"66.235.143.48, 255.255.255.254",
"66.235.143.64, 255.255.255.254",
"66.235.153.16, 255.255.255.240",
"66.235.153.32, 255.255.255.248",
"81.31.38.0, 255.255.255.128",
"82.98.86.0, 255.255.255.0",
"89.185.224.0, 255.255.224.0",
"207.66.128.0, 255.255.128.0" ];

// block these schemes; use the command line for ftp, rsync, etc. instead
var bad_schemes_RegExp = RegExp("^(?:ftp|sftp|tftp|ftp-data|rsync|finger|gopher)", "i")

// RegExp for schemes; lengths from
// perl -lane 'BEGIN{$l=0;} {!/^#/ && do{$ll=length($F[0]); if($ll>$l){$l=$ll;}};} END{print $l;}' /etc/services
var schemepart_RegExp = RegExp("^([\\w*+-]{2,15}):\\/{0,2}","i");
var hostpart_RegExp = RegExp("^((?:[\\w-]+\\.)+[a-zA-Z0-9-]{2,24}\\.?)", "i");
var querypart_RegExp = RegExp("^((?:[\\w-]+\\.)+[a-zA-Z0-9-]{2,24}\\.?[\\w~%.\\/^-])(\\??\\S*?)$", "i");
var domainpart_RegExp = RegExp("^(?:[\\w-]+\\.)*((?:[\\w-]+\\.)[a-zA-Z0-9-]{2,24})\\.?", "i");

//////////////////////////////////////////////////
// Define the is_ipv4_address function and vars //
//////////////////////////////////////////////////

var ipv4_RegExp = /^(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})$/;

function is_ipv4_address(host)
{
var ipv4_pentary = host.match(ipv4_RegExp);
var is_valid_ipv4 = false;

if (ipv4_pentary) {
    is_valid_ipv4 = true;
    for( i = 1; i <= 4; i++) {
        if (ipv4_pentary[i] >= 256) {
            is_valid_ipv4 = false;
        }
    }
}
return is_valid_ipv4;

}

// object hashes
// Note: original stackoverflow-based hasOwnProperty does not work within iOS kernel
var hasOwnProperty = function(obj, prop) {
return obj.hasOwnProperty(prop);
}

/////////////////////
// Done Setting Up //
/////////////////////

// debug with Chrome at chrome://net-export
// alert("Debugging message.")

//////////////////////////////////
// Define the FindProxyFunction //
//////////////////////////////////

var use_pass_rules_parts_flag = true; // use the pass rules for url parts, then apply the block rules
var alert_flag = false; // use for short-circuit '&&' to print debugging statements
var debug_flag = false; // use for short-circuit '&&' to print debugging statements

// EasyList filtering for FindProxyForURL(url, host)
function EasyListFindProxyForURL(url, host)
{
var host_is_ipv4 = is_ipv4_address(host);
var host_ipv4_address;

alert_flag && alert("url is: " + url);
alert_flag && alert("host is: " + host);

// Extract scheme and url without scheme
var scheme = url.match(schemepart_RegExp);
scheme = scheme ? scheme[1] : "";  // match() returns null when no scheme is found

// Remove the scheme and extract the path for regex efficiency
var url_noscheme = url.replace(schemepart_RegExp,"");
var url_pathonly = url_noscheme.replace(hostpart_RegExp,"");
var url_noquery = url_noscheme.replace(querypart_RegExp,"$1");
// Remove the server name from the url and host if host is not an IPv4 address
var url_noserver = !host_is_ipv4 ? url_noscheme.replace(domainpart_RegExp,"$1") : url_noscheme;
var url_noservernoquery = !host_is_ipv4 ? url_noquery.replace(domainpart_RegExp,"$1") : url_noquery;
var host_noserver =  !host_is_ipv4 ? host.replace(domainpart_RegExp,"$1") : host;

// Debugging results
if (debug_flag && alert_flag) {
    alert("url_noscheme is: " + url_noscheme);
    alert("url_pathonly is: " + url_pathonly);
    alert("url_noquery is: " + url_noquery);
    alert("url_noserver is: " + url_noserver);
    alert("url_noservernoquery is: " + url_noservernoquery);
    alert("host_noserver is: " + host_noserver);
}

// Short circuit to blackhole for good_da_host_exceptions
if ( hasOwnProperty(good_da_host_exceptions_exact_JSON,host) ) {
    alert_flag && alert("good_da_host_exceptions_exact_JSON blackhole!");
    return blackhole;
}

///////////////////////////////////////////////////////////////////////
// Check to make sure we can get an IPv4 address from the given host //
// name.  If we cannot do that then skip the Networks tests.         //
///////////////////////////////////////////////////////////////////////

host_ipv4_address = host_is_ipv4 ? host : (isResolvable(host) ? dnsResolve(host) : false);

if (host_ipv4_address) {
    alert_flag && alert("host ipv4 address is: " + host_ipv4_address);
    /////////////////////////////////////////////////////////////////////////////
    // If the IP translates to one of the GoodNetworks_Array (with exceptions) //
    // we pass it because it is considered safe.                               //
    /////////////////////////////////////////////////////////////////////////////

    for (i in GoodNetworks_Exceptions_Array) {
        tmpNet = GoodNetworks_Exceptions_Array[i].split(/,\\s*/);
        if (isInNet(host_ipv4_address, tmpNet[0], tmpNet[1])) {
            alert_flag && alert("GoodNetworks_Exceptions_Array Blackhole: " + host_ipv4_address);
            return blackhole;
        }
    }
    for (i in GoodNetworks_Array) {
        tmpNet = GoodNetworks_Array[i].split(/,\\s*/);
        if (isInNet(host_ipv4_address, tmpNet[0], tmpNet[1])) {
            alert_flag && alert("GoodNetworks_Array PASS: " + host_ipv4_address);
            return proxy;
        }
    }

    ///////////////////////////////////////////////////////////////////////
    // If the IP translates to one of the BadNetworks_Array we fail it   //
    // because it is not considered safe.                                //
    ///////////////////////////////////////////////////////////////////////

    for (i in BadNetworks_Array) {
        tmpNet = BadNetworks_Array[i].split(/,\\s*/);
        if (isInNet(host_ipv4_address, tmpNet[0], tmpNet[1])) {
            alert_flag && alert("BadNetworks_Array Blackhole: " + host_ipv4_address);
            return blackhole;
        }
    }
}

//////////////////////////////////////////////////////////////////////////////
// HTTPS: https scheme can only use domain information                      //
// unless PacHttpsUrlStrippingEnabled == false [Chrome] or                  //
// network.proxy.autoconfig_url.include_path == true [Firefox, about:config]              //
// E.g. on macOS:                                                           //
// defaults write com.google.Chrome PacHttpsUrlStrippingEnabled -bool false //
// Check setting at page chrome://policy                                    //
//////////////////////////////////////////////////////////////////////////////

// Assume browser has disabled path access if scheme is https and path is '/'
if ( scheme == "https" && url_pathonly == "/" ) {

    ///////////////////////////////////////////////////////////////////////
    // PASS LIST:   domains matched here will always be allowed.         //
    ///////////////////////////////////////////////////////////////////////

    if ( (good_da_host_exact_flag && (hasOwnProperty(good_da_host_exact_JSON,host_noserver)||hasOwnProperty(good_da_host_exact_JSON,host)))
        && !hasOwnProperty(good_da_host_exceptions_exact_JSON,host) ) {
            alert_flag && alert("HTTPS PASS: " + host + ", " + host_noserver);
        return proxy;
    }

    //////////////////////////////////////////////////////////
    // BLOCK LIST: stuff matched here will be blocked       //
    //////////////////////////////////////////////////////////

    if ( (bad_da_host_exact_flag && (hasOwnProperty(bad_da_host_exact_JSON,host_noserver)||hasOwnProperty(bad_da_host_exact_JSON,host))) ) {
        alert_flag && alert("HTTPS blackhole: " + host + ", " + host_noserver);
        return blackhole;
    }
}

////////////////////////////////////////
// HTTPS and HTTP: full path analysis //
////////////////////////////////////////

if (scheme == "https" || scheme == "http") {

    ///////////////////////////////////////////////////////////////////////
    // PASS LIST:   domains matched here will always be allowed.         //
    ///////////////////////////////////////////////////////////////////////

    if ( !hasOwnProperty(good_da_host_exceptions_exact_JSON,host)
        && ((good_da_host_exact_flag && (hasOwnProperty(good_da_host_exact_JSON,host_noserver)||hasOwnProperty(good_da_host_exact_JSON,host))) ||  // fastest test first
            (use_pass_rules_parts_flag &&
                (good_da_hostpath_exact_flag && (hasOwnProperty(good_da_hostpath_exact_JSON,url_noservernoquery)||hasOwnProperty(good_da_hostpath_exact_JSON,url_noquery)) ) ||
                // test logic: only do the slower test if the host has a (non)suspect fqdn
                (good_da_host_regex_flag && (good_da_host_regex_RegExp.test(host_noserver)||good_da_host_regex_RegExp.test(host))) ||
                (good_da_hostpath_regex_flag && (good_da_hostpath_regex_RegExp.test(url_noservernoquery)||good_da_hostpath_regex_RegExp.test(url_noquery))) ||
                (good_da_regex_flag && (good_da_regex_RegExp.test(url_noserver)||good_da_regex_RegExp.test(url_noscheme))) ||
                (good_url_parts_flag && good_url_parts_RegExp.test(url)) ||
                (good_url_regex_flag && good_url_regex_RegExp.test(url)))) ) {
        return proxy;
    }

    //////////////////////////////////////////////////////////
    // BLOCK LIST: stuff matched here will be blocked       //
    //////////////////////////////////////////////////////////
    // Debugging results
    if (debug_flag && alert_flag) {
        alert("hasOwnProperty(bad_da_host_exact_JSON," + host_noserver + "): " + (bad_da_host_exact_flag && hasOwnProperty(bad_da_host_exact_JSON,host_noserver)));
        alert("hasOwnProperty(bad_da_host_exact_JSON," + host + "): " + (bad_da_host_exact_flag && hasOwnProperty(bad_da_host_exact_JSON,host)));
        alert("hasOwnProperty(bad_da_hostpath_exact_JSON," + url_noservernoquery + "): " + (bad_da_hostpath_exact_flag && hasOwnProperty(bad_da_hostpath_exact_JSON,url_noservernoquery)));
        alert("hasOwnProperty(bad_da_hostpath_exact_JSON," + url_noquery + "): " + (bad_da_hostpath_exact_flag && hasOwnProperty(bad_da_hostpath_exact_JSON,url_noquery)));
        alert("bad_da_host_regex_RegExp.test(" + host_noserver + "): " + (bad_da_host_regex_flag && bad_da_host_regex_RegExp.test(host_noserver)));
        alert("bad_da_host_regex_RegExp.test(" + host + "): " + (bad_da_host_regex_flag && bad_da_host_regex_RegExp.test(host)));
        alert("bad_da_hostpath_regex_RegExp.test(" + url_noservernoquery + "): " + (bad_da_hostpath_regex_flag && bad_da_hostpath_regex_RegExp.test(url_noservernoquery)));
        alert("bad_da_hostpath_regex_RegExp.test(" + url_noquery + "): " + (bad_da_hostpath_regex_flag && bad_da_hostpath_regex_RegExp.test(url_noquery)));
        alert("bad_da_regex_RegExp.test(" + url_noserver + "): " + (bad_da_regex_flag && bad_da_regex_RegExp.test(url_noserver)));
        alert("bad_da_regex_RegExp.test(" + url_noscheme + "): " + (bad_da_regex_flag && bad_da_regex_RegExp.test(url_noscheme)));
        alert("bad_url_parts_RegExp.test(" + url + "): " + (bad_url_parts_flag && bad_url_parts_RegExp.test(url)));
        alert("bad_url_regex_RegExp.test(" + url + "): " + (bad_url_regex_flag && bad_url_regex_RegExp.test(url)));
    }

    if ( (bad_da_host_exact_flag && (hasOwnProperty(bad_da_host_exact_JSON,host_noserver)||hasOwnProperty(bad_da_host_exact_JSON,host))) ||  // fastest test first
        (bad_da_hostpath_exact_flag && (hasOwnProperty(bad_da_hostpath_exact_JSON,url_noservernoquery)||hasOwnProperty(bad_da_hostpath_exact_JSON,url_noquery)) ) ||
        // test logic: only do the slower test if the host has a (non)suspect fqdn
        (bad_da_host_regex_flag && (bad_da_host_regex_RegExp.test(host_noserver)||bad_da_host_regex_RegExp.test(host))) ||
        (bad_da_hostpath_regex_flag && (bad_da_hostpath_regex_RegExp.test(url_noservernoquery)||bad_da_hostpath_regex_RegExp.test(url_noquery))) ||
        (bad_da_regex_flag && (bad_da_regex_RegExp.test(url_noserver)||bad_da_regex_RegExp.test(url_noscheme))) ||
        (bad_url_parts_flag && bad_url_parts_RegExp.test(url)) ||
        (bad_url_regex_flag && bad_url_regex_RegExp.test(url)) ) {
        alert_flag && alert("Blackhole: " + url + ", " + host);
        return blackhole;
    }
}

// default pass
alert_flag && alert("Default PASS: " + url + ", " + host);
return proxy;

}

// User-supplied FindProxyForURL()
''' + self.original_FindProxyForURL_function

    self.easylist_strategy = """\

EasyList rules:
https://adblockplus.org/filters
https://adblockplus.org/filter-cheatsheet
https://opnsrce.github.io/javascript-performance-tip-precompile-your-regular-expressions
https://adblockplus.org/blog/investigating-filter-matching-algorithms

Strategies to convert EasyList rules to Javascript tests:

In general:

  1. Preference for performance over 1:1 EasyList functionality
  2. Limit number of rules to ~O(10k) to avoid computational burden on mobile devices
  3. Exact matches: use Object hashing (very fast); use efficient NFA RegExp's for all else
  4. Divide and conquer specific cases to avoid large RegExp's
  5. Based on testing code performance on an iPhone: mobile Safari, Chrome with System Activity Monitor.app
  6. Backstop these proxy.pac rules with Privoxy rules and a browser plugin
"""

# ignore any rules following comments with these strings, until the next non-ignorable comment

commentname_sections_ignore_re = r'(?:{})'.format('|'.join(re.sub(r'([.])',r'\.',x) for x in '''
gizmodo.in
shink.in
project-free-tv.li
vshare.eu
pencurimovie.ph
filmlinks4u.is
Spiegel.de
bento.de
German
French
Arabic
Armenian
Belarusian
Bulgarian
Chinese
Croatian
Czech
Danish
Dutch
Estonian
Finnish
Georgian
Greek
Hebrew
Hungarian
Icelandic
Indian
Indonesian
Italian
Japanese
Korean
Latvian
Lithuanian
Norwegian
Persian
Polish
Portuguese
Romanian
Russian
Serbian
Singaporean
Slovene
Slovak
Spanish
Swedish
Thai
Turkish
Ukranian
Ukrainian
Vietnamese
Gamestar.de
Focus.de
tvspielfilm.de
Prosieben
Wetter.com
Woxikon.de
Fanfiktion.de
boote-forum.de
comunio.de
planetsnow.de'''.split('\n')))

# include these rules, no matter their priority
# necessary to include desired rules that fall below the threshold for a reasonably-sized PAC
# Refs: https://guardianapp.com/ios-app-location-report-sep2018.html

include_these_good_rules = []
include_these_bad_rules = [x for x in """
/securepubads.
||google.com/pagead
||facebook.com/plugins/*
||connect.facebook.com
||connect.facebook.net
||platform.twitter.com
||api.areametrics.com
||in.cuebiq.com
||et.intake.factual.com
||api.factual.com
||api.beaconsinspace.com
||api.huq.io
||m2m-api.inmarket.com
||mobileapi.mobiquitynetworks.com
||sdk.revealmobile.com
||api.safegraph.com
||incoming-data-sense360.s3.amazonaws.com
||ios-quinoa-personal-identify-prod.sense360eng.com
||ios-quinoa-events-prod.sense360eng.com
||ios-quinoa-high-frequency-events-prod.sense360eng.com
||v1.blueberry.cloud.databerries.com
||pie.wirelessregistry.com""".split('\n') if not bool(re.search(r'^\s*?(?:#|$)',x))]

# regex's for highly weighted rules

high_weight_regex_strings = """
trac?k
beacon
stat[is]?
anal[iy]
goog
facebook
yahoo
amazon
adob
msn

# 2-grams

goog\S+?ad
amazon\S+?ad
yahoo\S+?ad
facebook\S+?ad
adob\S+?ad
msn\S+ad
doubleclick
cooki
twitter
krxd
pagead
syndicat
(?:\bad|ad\b)
securepub
static
\boas\b
ads
cdn
cloud
banner
financ
share
traffic
creativ
media
host
affil
^mob
data
your?
watch
survey
stealth
invisible
brand
site
merch
kli[kp]
clic?k
popup
log
assets
count
metric
score
event
tool
quant
chart
opti?m
partner
sponsor
affiliate"""

high_weight_regex = [re.compile(x,re.IGNORECASE) for x in high_weight_regex_strings.split('\n') if not bool(re.search(r'^\s*?(?:#|$)',x))]

# regex to limit regex filters (bootstrapping in part from securemecca.com PAC regex keywords)

if False:
    badregex_regex_filters = ''  # Accept everything
else:
    badregex_regex_filters = high_weight_regex_strings + '\n' + '''
cooki
pagead
syndicat
(?:\bad|ad\b)
cdn
cloud
banner
image
img
pop
game
free
financ
film
fast
farmville
fan
exp
share
cash
money
dollar
buck
dump
deal
daily
content
kick
down
file
video
score
partner
match
ifram
cam
widget
monk
rapid
platform
google
follow
shop
love
content
#^(\d{1,3})\.(\d{1,3})\.(\d{1,3}).(\d{1,3})$
#^([A-Za-z]{12}|[A-Za-z]{8}|[A-Za-z]{50})\.com$
smile
happy
traffic
dash
board
tube
torrent
down
creativ
host
affil
\.(biz|ru|tv|stream|cricket|online|racing|party|trade|webcam|science|win|accountant|loan|faith|cricket|date)
^mob
join
data
your?
watch
survey
stealth
invisible
social
brand
site
script
xchang
merch
kli(k|p)
clic?k
zip
invest
arstech
buzzfeed
imdb
twitter
baidu
yandex
youtube
ebay
discovercard
chase
hsbc
usbank
santander
kaspersky
symantec
brightcove
hidden
invisible
macromedia
flash
[^i]scan[^dy]
secret
skype
tsbbank
tunnel
ubs\.com
unblock
unlock
usaa\.com
usbank\.com
ustreas\.gov
ustreasury
verifiedbyvisa\.com
viagra
wachovia
wellsfargo\.com
westernunion
windowsupdate
plugin
nielsen
oas-config
oas\/oas
pix
video-plugin
videodownloader
visit
voxmedia\.com
vtrack\.php
w3track\.com
web_?ad
webiq
weblog
webtrek
webtrend
wget\.exe
widgets
winstart\.exe
winstart\.zip
wired\.com
ad-limits\.js
ad-manager
ad_engine
adx\.js
\.bat
\.bin
[^ck]anal[^_]
\.com/a\.gif
\.com/p\.gif
\.com\.au\/ads
\.cpl
[^bhmz]eros
\.exe
\.exe
\.msi
\.net\/p\.gif
\.pac
\.pdf
\.pdf\.exe
\.rar
\.scr
\.sh
transparent1x1\.gif
\/travidia
__utm\.js
whv2_001\.js
xtcore\.js
\.zip
sharethis\.com
stats\.wp\.com
[^i]crack
virgins\.com
\.xyz
shareasale\.com
financialcontent\.com
netdna-cdn\.com
gstatic\.com
taboola\.com
ooyala\.com
pinimg\.com
cloudfront\.net
d21rhj7n383afu
d19rpgkrjeba2z
outbrain\.com
themindcircle\.com
google-analytics\.com
nocookie\.net
jwpsrv\.com
doubleclick\.net
d2c8v52ll5s99u
d3qdfnco3bamip
yarn\.co
visura\.co
gatehousmedia\.com
imore\.com
openx\.net
gigya\.com
shopify\.com
tiqcdn\.com
criteo\.net
ntv\.io
getyarn\.io
d15zn84cat5tp0
d1pz6dax0t5mop
allinviews\.com
pinterest\.com
media\.net
selectmedia\.asia
jsdelivr\.net
pubmatic\.com
aurubis\.com
cloudflare\.com
blueconic\.net
krxd\.net
cdn-mw\.com
serving-sys\.com
openx\.net
segment\.com
viglink\.com
viafoura\.net
aolcdn\.net
shoofl\.tv
inq\.com
optimizely\.com
kinja-static\.com
d3926qxcw0e1bh
yieldmo\.com
indexww\.com
2mdn\.net
newrelic\.com
guim\.co\.uk
futurecdn\.net
vidible\.tv
vindicosuite\.com
fsdn\.com
cpanel\.net
perfectmarket\.com
about\.me
omnigroup\.com
lightboxcdn\.com
hotjar\.com
addthis\.com
art19\.com
lkqd\.net
mathtag\.com
dc8xl0ndzn2cb
d1z2jf7jlzjs58
chowstatic\.com
spokenlayer\.com
akamaized\.net
d2qi7ewimk4e2w
stickyadstv\.com
fastly\.net
ddkpmexz7bq23
newscgp\.com
privy\.com
aspnetcdn\.com
parsley\.com
demdex\.net
d3alqb8vzo7fun
netdna-ssl\.com
yottaa\.net
go-mpulse\.net
bkrtx\.com
crwdcntrl\.net
ggpht\.com
alamy\.com
spokeo\.com
d2gatte9o95jao
dawm7kda6y2v0
dwgyu36up6iuz
litix\.io
sail-horizon\.com
cnevids\.com
dz310nzuyimx0
skimresources\.com
jwpcdn\.com
dwin2\.com
htl\.bid
df80k0z3fi8zg
o0bg\.com
d8rk54i4mohrb
simplereach\.com
adsrvr\.com
vertamedia\.com
disqusads\.com
polipace\.com
jwplatform\.com
dianomi\.com
kinja-img\.com
marketingvideonow\.com
beachfrontmedia\.com
mfcreative\.com
msecdn\.com
syndetics\.com
keycdn\.com
uservoice\.com
ravenjs\.com
d1fc8wv8zag5ca
broaddoor\.com
d3s44e87wooplq
d2x3bkdslnxkuj
selectablemedia\.com
yldbt\.com
streamrail\.net
seriable\.com
thoughtco\.com
perimeterx\.net
owneriq\.net
ml314\.com
d1e9d0h8gakqc
dtcn\.com
trustarc\.com
licdn\.com
effectivemeasure\.net
list-manage\.com
mtvnservices\.com
npttech\.com
dc8na2hxrj29i
tubemogul\.com
d1lqe9temigv1p
dna8twue3dlxq
adroll\.com
googleadservices\.com
localytics\.com
gfx\.ms
adsensecustomsearchads\.com
upsellit\.com
parrable\.com
ads-twitter\.com
atlanticinsights\.com
pagefair\.com
areyouahuman\.com
custhelp\.com
turn\.com
connatix\.com
printfriendly\.com
scroll\.com
cybersource\.com
zergnet\.com
jsintegrity\.com
cedexis\.com
3lift\.com
onestore\.ms
mdpcdn\.com
iperceptions\.com
dotomi\.com
pardot\.com
marketo\.net
rfksrv\.com
adnxs\.com
shartethis\.com
d31qbv1cthcecs
douyfz3utcehi
scorecardresearch\.com
nonembed\.com
peer39\.com
d3p2jlw8pmhccg
dnkzzz1hlto79
zqtk\.net
cloudinary\.com
omtrdc\.net
d5nxst8fruw4z
d1p6rqiydn62x8
dmtracker\.com
dp8hsntg6do36
buysellads\.com
intercomcdn\.net
dpstvy7p9whsy
cpx\.to
b-cdn\.net
googlecommerce\.com
insightexpressai\.com
evidon\.com
footprint\.net
advertising\.com
specificmedia\.com
quantcount\.com
amgdgt\.com
bluekai\.com
smartclip\.net
azureedge\.net
iesnare\.com
medscape\.com
agkn\.com
cliipa\.com
digiday\.com
convertro\.com
linksynergy\.com
woobi\.com
adx1\.com
254a\.com
mediaforge\.com
videostat\.net
theadtech\.com
emxdgt\.com
acuityplatform\.com
header\.direct'''

badregex_regex_filters = '\n'.join(x for x in badregex_regex_filters.split('\n') if not bool(re.search(r'^\s*?(?:#|$)',x)))
badregex_regex_filters_re = re.compile(r'(?:{})'.format('|'.join(badregex_regex_filters.split('\n'))),re.IGNORECASE)

if __name__ == "__main__":
    res = EasyListPAC()

sys.exit()
SyntaxError: multiple statements found while compiling a single statement

= RESTART: C:/Users/me/Documents/WPy64-3830/notebooks/easylist_pac.py
Ignore rules following comment " ---------- German Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- French Specific Annoyances ----------"… Ignore rules following comment " ---------- French Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Arabic Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Chinese Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Croatian Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Danish Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Dutch Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Finnish Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Greek Site Generic Hiding Rules ----------"… Ignore rules following comment " ---------- Hebrew Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Indian Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Italian Site Specific Blocking Rules ----------"… Ignore rules following comment " ---------- Italian Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Japanese Site Specific Rules ----------"… Ignore rules following comment " ---------- Korean Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Latvian Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Norwegian Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Polish Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Polish Site Specific Blocking Rules ----------"… Ignore rules following comment " ---------- Portuguese Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Romanian Site Specific Hiding Rules 
----------"… Ignore rules following comment " ---------- Russian Site Specific Blocking Rules ----------"… Ignore rules following comment " ---------- Russian Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Serbian Site Specific Blocking Rules ----------"… Ignore rules following comment " ---------- Spanish Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Swedish Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Turkish Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Ukranian Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- German Site Specific Hiding Rules ----------"… Ignore rules following comment " Spiegel.de"… Ignore rules following comment " Focus.de"… Ignore rules following comment " Gamestar.de"… Ignore rules following comment " Focus.de"… Ignore rules following comment " tvspielfilm.de"… Ignore rules following comment " Wetter.com"… Ignore rules following comment " Woxikon.de"… Ignore rules following comment " comunio.de"… Ignore rules following comment " ---------- French Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Arabic Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Arabic Specific Media Elements ----------"… Ignore rules following comment " ---------- Chinese Specific Media Elements ----------"… Ignore rules following comment " ---------- Chinese Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Danish Specific Social Media Elements ----------"… Ignore rules following comment " ---------- Dutch Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Greek Specific Social Media Elements ----------"… Ignore rules following comment " ---------- Hebrew Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Hungarian 
Specific Media Elements ----------"… Ignore rules following comment " ---------- Indian Specific Social Media Elements ----------"… Ignore rules following comment " ---------- Indonesian Specific Social Media Elements ----------"… Ignore rules following comment " ---------- Italian Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Japanese Specific Media Elements ----------"… Ignore rules following comment " ---------- Japanese Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Korean Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Latvian Specific Social Media Elements ----------"… Ignore rules following comment " ---------- Norwegian Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Polish Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Portuguese Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Romanian Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Russian Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Spanish Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Swedish Site Specific Hiding Rules ----------"… Ignore rules following comment " ---------- Turkish Specific Social Media Elements ----------"… Ignore rules following comment " ---------- German ----------"… Ignore rules following comment " ---------- French ----------"… Ignore rules following comment " ---------- Arabic ----------"… Ignore rules following comment " ---------- Bulgarian ----------"… Ignore rules following comment " ---------- Chinese ----------"… Ignore rules following comment " ---------- Croatian ----------"… Ignore rules following comment " ---------- Czech ----------"… Ignore rules following comment " ---------- Danish ----------"… Ignore rules following comment " ---------- 
Dutch ----------"… Ignore rules following comment " ---------- Estonian ----------"… Ignore rules following comment " ---------- Finnish ----------"… Ignore rules following comment " ---------- Greek ----------"… Ignore rules following comment " ---------- Hebrew ----------"… Ignore rules following comment " ---------- Hungarian ----------"… Ignore rules following comment " ---------- Icelandic ----------"… Ignore rules following comment " ---------- Indian ----------"… Ignore rules following comment " ---------- Italian ----------"… Ignore rules following comment " ---------- Japanese ----------"… Ignore rules following comment " ---------- Korean ----------"… Ignore rules following comment " ---------- Latvian ----------"… Ignore rules following comment " ---------- Lithuanian ---------"… Ignore rules following comment " ---------- Norwegian ----------"… Ignore rules following comment " ---------- Polish ----------"… Ignore rules following comment " ---------- Portuguese ----------"… Ignore rules following comment " ---------- Romanian ----------"… Ignore rules following comment " ---------- Russian ----------"… Ignore rules following comment " ---------- Serbian ----------"… Ignore rules following comment " ---------- Slovak ----------"… Ignore rules following comment " ---------- Spanish ----------"… Ignore rules following comment " ---------- Swedish ----------"… Ignore rules following comment " ---------- Thai ----------"… Ignore rules following comment " ---------- Turkish ----------"… Ignore rules following comment " ---------- Ukrainian ----------"… Ignore rules following comment " ---------- Vietnamese ----------"… Ignore rules following comment " ---------- German ----------"… Ignore rules following comment " ---------- French ----------"… Ignore rules following comment " ---------- Bulgarian ----------"… Ignore rules following comment " ---------- Chinese ----------"… Ignore rules following comment " ---------- Croatian ----------"… Ignore rules 
following comment " ---------- Czech ----------"… Ignore rules following comment " ---------- Danish ----------"… Ignore rules following comment " ---------- Dutch ----------"… Ignore rules following comment " ---------- Finnish ----------"… Ignore rules following comment " ---------- Greek ----------"… Ignore rules following comment " ---------- Hebrew ----------"… Ignore rules following comment " ---------- Hungarian ----------"… Ignore rules following comment " ---------- Icelandic ----------"… Ignore rules following comment " ---------- Indian ----------"… Ignore rules following comment " ---------- Italian ----------"… Ignore rules following comment " ---------- Japanese ----------"… Ignore rules following comment " ---------- Korean ----------"… Ignore rules following comment " ---------- Latvian ----------"… Ignore rules following comment " ---------- Norwegian ----------"… Ignore rules following comment " ---------- Polish ----------"… Ignore rules following comment " ---------- Portuguese ----------"… Ignore rules following comment " ---------- Romanian ----------"… Ignore rules following comment " ---------- Russian ----------"… Ignore rules following comment " ---------- Slovak ----------"… Ignore rules following comment " ---------- Spanish ----------"… Ignore rules following comment " ---------- Swedish ----------"… Ignore rules following comment " ---------- Thai ----------"… Ignore rules following comment " ---------- Turkish ----------"… Ignore rules following comment " ---------- Ukrainian ----------"… Ignore rules following comment " Russian rating sites"… Ignore rules following comment " German"… Ignore rules following comment " French"… Ignore rules following comment " Armenian"… Ignore rules following comment " Belarusian"… Ignore rules following comment " Bulgarian"… Ignore rules following comment " Chinese"… Ignore rules following comment " Croatian"… Ignore rules following comment " Czech"… Ignore rules following comment " Danish"… Ignore 
rules following comment " Dutch"… Ignore rules following comment " Estonian"… Ignore rules following comment " Finnish"… Ignore rules following comment " Greek"… Ignore rules following comment " Hebrew"… Ignore rules following comment " Hungarian"… Ignore rules following comment " Icelandic"… Ignore rules following comment " Indonesian"… Ignore rules following comment " Italian"… Ignore rules following comment " Japanese"… Ignore rules following comment " Korean"… Ignore rules following comment " Latvian"… Ignore rules following comment " Lithuanian"… Ignore rules following comment " Norwegian"… Ignore rules following comment " Persian"… Ignore rules following comment " Polish"… Ignore rules following comment " Portuguese"… Ignore rules following comment " Romanian"… Ignore rules following comment " Russian"… Ignore rules following comment " Serbian"… Ignore rules following comment " Slovak"… Ignore rules following comment " Spanish"… Ignore rules following comment " Swedish"… Ignore rules following comment " Thai"… Ignore rules following comment " Turkish"… Ignore rules following comment " Ukranian"… Ignore rules following comment " Vietnamese"… Ignore rules following comment " German"… Ignore rules following comment " Arabic"… Ignore rules following comment " French"… Ignore rules following comment " Belarusian"… Ignore rules following comment " Croatian"… Ignore rules following comment " Chinese"… Ignore rules following comment " Croatian"… Ignore rules following comment " Czech"… Ignore rules following comment " Danish"… Ignore rules following comment " Dutch"… Ignore rules following comment " Estonian"… Ignore rules following comment " Finnish"… Ignore rules following comment " Georgian"… Ignore rules following comment " Greek"… Ignore rules following comment " Hebrew"… Ignore rules following comment " Hungarian"… Ignore rules following comment " Icelandic"… Ignore rules following comment " Indian"… Ignore rules following comment " Indonesian"… Ignore rules 
following comment " Italian"… Ignore rules following comment " Japanese"… Ignore rules following comment " Korean"… Ignore rules following comment " Latvian"… Ignore rules following comment " Lithuanian"… Ignore rules following comment " Norwegian"… Ignore rules following comment " Persian"… Ignore rules following comment " Polish"… Ignore rules following comment " Portuguese"… Ignore rules following comment " Romanian"… Ignore rules following comment " Russian"… Ignore rules following comment " Serbian"… Ignore rules following comment " Slovak"… Ignore rules following comment " Slovene"… Ignore rules following comment " Spanish"… Ignore rules following comment " Swedish"… Ignore rules following comment " Thai"… Ignore rules following comment " Turkish"… Ignore rules following comment " Ukranian"… Ignore rules following comment " Vietnamese"… Ignore rules following comment " German"… Ignore rules following comment " Danish"… Ignore rules following comment " French"… Ignore rules following comment " Indian"… Ignore rules following comment " Arabic"… Ignore rules following comment " Persian / Farsi"… Ignore rules following comment " Bulgarian"… Ignore rules following comment " Chinese"… Ignore rules following comment " Croatian"… Ignore rules following comment " Czech"… Ignore rules following comment " Dutch"… Ignore rules following comment " Finnish"… Ignore rules following comment " Greek"… Ignore rules following comment " Hebrew"… Ignore rules following comment " Hungarian"… Ignore rules following comment " Italian"… Ignore rules following comment " Japanese"… Ignore rules following comment " Korean"… Ignore rules following comment " Latvian"… Ignore rules following comment " Norwegian"… Ignore rules following comment " Polish"… Ignore rules following comment " Portuguese"… Ignore rules following comment " Russian"… Ignore rules following comment " Serbian"… Ignore rules following comment " Slovene"… Ignore rules following comment " Spanish"… Ignore rules 
following comment " Swedish"… Ignore rules following comment " Thai"… Ignore rules following comment " Turkish"… Ignore rules following comment " Ukrainian"… Ignore rules following comment " Vietnamese"… Ignore rules following comment " Indonesian"… Ignore rules following comment " Gamestar.de"… Ignore rules following comment " Focus.de"… Ignore rules following comment " tvspielfilm.de"… Ignore rules following comment " Prosieben"… Ignore rules following comment " Wetter.com"… Ignore rules following comment " Woxikon.de"… Ignore rules following comment " Fanfiktion.de"… Ignore rules following comment " boote-forum.de"… Ignore rules following comment " comunio.de"… Ignore rules following comment " planetsnow.de"… Ignore rules following comment " ---------- German ----------"… Ignore rules following comment " ---------- French ----------"… Ignore rules following comment " ---------- Arabic ----------"… Ignore rules following comment " ---------- Bulgarian ----------"… Ignore rules following comment " ---------- Chinese ----------"… Ignore rules following comment " ---------- Czech ----------"… Ignore rules following comment " ---------- Danish ----------"… Ignore rules following comment " ---------- Dutch ----------"… Ignore rules following comment " ---------- Finnish ----------"… Ignore rules following comment " ---------- Hebrew ----------"… Ignore rules following comment " ---------- Hungarian ----------"… Ignore rules following comment " ---------- Italian ----------"… Ignore rules following comment " ---------- Indonesian ----------"… Ignore rules following comment " ---------- Japanese ----------"… Ignore rules following comment " ---------- Korean ----------"… Ignore rules following comment " ---------- Latvian ----------"… Ignore rules following comment " ---------- Norwegian ----------"… Ignore rules following comment " ---------- Polish ----------"… Ignore rules following comment " ---------- Portuguese ----------"… Ignore rules following comment " 
---------- Romanian ----------"… Ignore rules following comment " ---------- Russian ----------"… Ignore rules following comment " ---------- Spanish ----------"… Ignore rules following comment " ---------- Swedish ----------"… Ignore rules following comment " ---------- Thai ----------"… Ignore rules following comment " ---------- Turkish ----------"… Ignore rules following comment " ---------- Ukrainian ----------"… Ignore rules following comment " adinsertion used on gizmodo.in lifehacker.co.in"… Ignore rules following comment " vshare.eu"… Ignore rules following comment " filmlinks4u.is"… Ignore rules following comment " Spiegel.de"… Ignore rules following comment " bento.de"… Ignore rules following comment " Healthy Advertising (Spanish)"… Performing logistic regression on rule sets. This will take a few minutes… done.

Warning (from warnings module):
File "C:/Users/me/Documents/WPy64-3830/notebooks/easylist_pac.py", line 1202
warnings.warn("Truncating regex alternatives rule set '{}' from {:d} to {:d}.".format(array_name,len(arr),self.truncate_alternatives_max))
UserWarning: Truncating regex alternatives rule set 'bad_da_hostpath_regex' from 1548 to 499.

Warning (from warnings module):
File "C:/Users/me/Documents/WPy64-3830/notebooks/easylist_pac.py", line 1202
warnings.warn("Truncating regex alternatives rule set '{}' from {:d} to {:d}.".format(array_name,len(arr),self.truncate_alternatives_max))
UserWarning: Truncating regex alternatives rule set 'bad_url_parts' from 7669 to 499.
good_da_host_exact: 110 rules
good_da_host_regex: 4 rules
good_da_hostpath_exact: 0 rules
good_da_hostpath_regex: 0 rules
good_da_regex: 0 rules
good_da_host_exceptions_exact: 39 rules
bad_da_host_exact: 3474 rules
bad_da_host_regex: 12 rules
bad_da_hostpath_exact: 551 rules
bad_da_hostpath_regex: 1549 rules
bad_da_regex: 150 rules
good_url_parts: 0 rules
bad_url_parts: 7670 rules
good_url_regex: 0 rules
bad_url_regex: 9 rules

For comparison, it generated a 175 KB (179,762 bytes) file, but one of the "original" ad-block PAC files is only 51.8 KB. It seems this PAC file primarily blocks exact domains, whereas the no-ads PAC (at http://www.schooner.com/~loverso/no-ads/) relies heavily on regex and, FWIW, hasn't been updated since November of last year.

Just figured I'd give some constructive criticism; I was kind of confused by the "warnings" and "exceptions".
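For readers puzzled by the same warnings: the message raised at line 1202 indicates only that a rule list was cut down to a maximum length before being written to the PAC file. The following is a minimal sketch of that behaviour, not the script's actual code. The helper name `truncate_alternatives` is hypothetical, and the limit of 499 is taken from the log above.

```python
import warnings

def truncate_alternatives(arr, array_name, truncate_alternatives_max=499):
    """Return at most truncate_alternatives_max items, warning if any are dropped.

    Hypothetical helper; only the warning text is taken from the log above.
    """
    if len(arr) > truncate_alternatives_max:
        warnings.warn("Truncating regex alternatives rule set '{}' from {:d} to {:d}."
                      .format(array_name, len(arr), truncate_alternatives_max))
        arr = arr[:truncate_alternatives_max]
    return arr

# Made-up rule names, sized to match the first warning in the log.
rules = ['rule{:d}'.format(k) for k in range(1548)]
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    kept = truncate_alternatives(rules, 'bad_da_hostpath_regex')
```

Under this reading the warnings are informational: the run completed and the rule counts printed afterwards are the sizes before truncation.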

Appstore

When running easylist_pac.py, the iOS App Store gets blocked. A whitelist rule is probably needed.

Integration

Wow, this is really nice. Could you add support for using it with uBlock and Android ad blockers like the ones below?
github.com/julian-klode/dns66
github.com/AdguardTeam/AdguardForAndroid
github.com/bitbeans/SimpleDnsCrypt - Under the filter section
Thanks

Safari 15 iOS Bypasses proxy.pac PROXY setting for HTTPS 400 code

The PROXY blackhole approach used in this repo has stopped working for HTTPS requests on all iOS after updating to Safari on iOS/iPadOS 15.

Most requests now are HTTPS, so this breaks functionality.

Safari 15 appears to bypass the proxy.pac PROXY and sends requests to https://unwarranted.tracker.website/?whatever.

I hypothesize the reason is that the proxy returns 400 for such HTTPS CONNECT requests. Its behavior, which is expected for HTTPS CONNECT requests, looks like:

curl -I --proxy http://my.blackhole.server:8119 https://unwarranted.tracker.website/?whatever
HTTP/1.1 400 Bad Request
Server: nginx/1.21.3
Date: Sat, 25 Sep 2021 19:17:07 GMT
Content-Type: text/html
Content-Length: 157
Connection: close

curl: (56) Received HTTP code 400 from proxy after CONNECT

The fix appears to be to deprecate this repo and use Privoxy’s HTTPS inspection along with adblock2privoxy.

Reference: https://developer.apple.com/forums/thread/691279
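The curl check above can also be scripted. This is a minimal sketch assuming the placeholder proxy host and port from the example. The helper names `build_connect_request`, `parse_status_code`, and `probe_proxy` are illustrative, and nothing touches the network unless `probe_proxy` is called.

```python
import socket

def build_connect_request(target_host, target_port=443):
    """Build the HTTP CONNECT request a client sends to a proxy for HTTPS."""
    authority = "{}:{:d}".format(target_host, target_port)
    return "CONNECT {0} HTTP/1.1\r\nHost: {0}\r\n\r\n".format(authority).encode("ascii")

def parse_status_code(response_bytes):
    """Extract the numeric status code from the first line of a proxy reply."""
    status_line = response_bytes.split(b"\r\n", 1)[0].decode("ascii", "replace")
    return int(status_line.split()[1])

def probe_proxy(proxy_host, proxy_port, target_host, timeout=5.0):
    """Send CONNECT through the proxy and return its status code (needs network)."""
    with socket.create_connection((proxy_host, proxy_port), timeout=timeout) as sock:
        sock.sendall(build_connect_request(target_host))
        return parse_status_code(sock.recv(4096))

# Offline check against the reply shown in the curl output above.
sample_reply = b"HTTP/1.1 400 Bad Request\r\nServer: nginx/1.21.3\r\n\r\n"
status = parse_status_code(sample_reply)
```

Calling `probe_proxy("my.blackhole.server", 8119, "unwarranted.tracker.website")` against a blackhole server configured as above would be expected to return the same 400 code.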

Error compiling easylist_pac.py under osx with python 3.6

Hi,
Great program, just what I was looking for ;-)
As a total newbie to Python, I just managed to install it under OS X (El Capitan).
When I run your script I get these error messages.

Do you have any suggestions?
Thx a lot

Traceback (most recent call last):
File "/Users/peter/Documents/Python3_Scripts/Easylist-proxypac/easylist_pac Kopie ORIGINAL.py", line 2426, in <module>
badregex_regex_filters_re = re.compile(r'(?:{})'.format('|'.join(badregex_regex_filters.split('\n'))),re.IGNORECASE)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/re.py", line 233, in compile
return _compile(pattern, flags)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/re.py", line 301, in _compile
p = sre_compile.compile(pattern, flags)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/sre_compile.py", line 562, in compile
p = sre_parse.parse(p, flags)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/sre_parse.py", line 856, in parse
p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, False)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/sre_parse.py", line 415, in _parse_sub
itemsappend(_parse(source, state, verbose))
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/sre_parse.py", line 763, in _parse
p = _parse_sub(source, state, sub_verbose)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/sre_parse.py", line 415, in _parse_sub
itemsappend(_parse(source, state, verbose))
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/sre_parse.py", line 501, in _parse
code = _escape(source, this, state)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/sre_parse.py", line 401, in _escape
raise source.error("bad escape %s" % escape, len(escape))
sre_constants.error: bad escape \m at position 16919
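Since Python 3.6, the `re` module treats an unknown escape made of a backslash and an ASCII letter (such as `\m`) as an error, so a single stray backslash in one filter line breaks the whole joined pattern. One way to locate the offending line is to compile each line separately. This is a minimal sketch; the helper name `find_bad_patterns` and the three-line sample are illustrative and not taken from the script.

```python
import re

def find_bad_patterns(filters_text):
    """Return (line_number, pattern, error_message) for each line that fails to compile."""
    bad = []
    for lineno, pattern in enumerate(filters_text.split('\n'), start=1):
        try:
            re.compile(pattern, re.IGNORECASE)
        except re.error as exc:
            bad.append((lineno, pattern, str(exc)))
    return bad

# Hypothetical sample: the second line has a backslash directly before the letter "m".
sample = '\n'.join([r'addthis\.com', r'ads\metrics', r'fastly\.net'])
problems = find_bad_patterns(sample)
```

Running the same helper over the real `badregex_regex_filters` string should report the line behind position 16919, which can then be corrected or escaped.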

AttributeError with -w and MatplotlibDeprecationWarning

Using py -3 easylist_pac.py -w on Windows 10 produces an error (Python version 3.7.8, Windows version 10.0.18362.1016):

Traceback (most recent call last):
  File "easylist_pac.py", line 2270, in <module>
    res = EasyListPAC()
  File "easylist_pac.py", line 63, in __init__
    self.prioritize_rules()
  File "easylist_pac.py", line 252, in prioritize_rules
    self.logreg_priorities()
  File "easylist_pac.py", line 312, in logreg_priorities
    if self.sliding_window: self.logreg_sliding_window()
  File "easylist_pac.py", line 432, in logreg_sliding_window
    p.start()
  File "C:\Program Files\Python37\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "C:\Program Files\Python37\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Program Files\Python37\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Program Files\Python37\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Program Files\Python37\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'EasyListPAC.logreg_sliding_window.<locals>.training_op'

Then there is the MatplotlibDeprecationWarning (Matplotlib version 1.15.0):

easylist_pac.py:50: MatplotlibDeprecationWarning: Support for setting the 'text.latex.preamble' or 'pgf.preamble' rcParam to a list of strings is deprecated since 3.3 and will be removed two minor releases later; set it to a single string instead.
  mpl.rcParams['text.latex.preamble'] = [r'\\usepackage{amsmath,sfmath} \\boldmath']

I'm not familiar with using Matplotlib. @essandess
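The AttributeError in the first traceback comes from how multiprocessing starts processes on Windows: the target is pickled and sent to the child process, and a function defined inside another function (here `training_op` inside `logreg_sliding_window`) cannot be pickled. The sketch below reproduces the distinction with `pickle` alone. The function names are illustrative and none of this is taken from easylist_pac.py.

```python
import pickle

def module_level_op(x):
    """A top-level function: pickled by reference, so usable as a spawn target."""
    return x + 1

def make_local_op():
    """Return a nested function, analogous to training_op in the traceback."""
    def local_op(x):
        return x + 1
    return local_op

def can_pickle(obj):
    """Report whether pickle can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False

# The nested function is rejected; the same logic at module level would not be.
local_ok = can_pickle(make_local_op())
```

A likely fix, under that assumption, is to move the worker function to module level (or make it a method-free callable) and pass any state it needs as arguments.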

Banner ads, “Doubleclick” ads still appearing

Using the proxy.pac in iOS (Auto config under Wifi name) doesn’t block some ads like the ads.doubleclick ads seen on many popular (and unpopular) websites. It also slows internet connection to certain websites, Twitter for instance.

Testing efficacy

I realize that the .pac packages only a subset of the rules; however, so far I don't see anything being blocked.

Perhaps a concrete example, which would at least show the rules are working, would be helpful. (For instance, a specific site with known ads that are blocked by the .pac.) Can you suggest anything?

Blocking iOS Updates

Is it possible to modify this and have it block mesu.apple.com, appldnld.apple.com and gdmf.apple.com?
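As an illustration of what such a rule set could look like, here is a standalone sketch that generates a small PAC file sending exact host matches to a blackhole proxy. It is not how easylist_pac.py builds its output. The helper name `make_blocking_pac` and the blackhole address `127.0.0.1:8119` are assumptions, and whether iOS software update traffic honors PAC settings has not been verified here.

```python
import json

def make_blocking_pac(blocked_hosts, blackhole="PROXY 127.0.0.1:8119"):
    """Return PAC file text that sends exact host matches to a blackhole proxy."""
    # A JSON object with null values doubles as a JavaScript lookup table.
    hosts_js = json.dumps({host: None for host in blocked_hosts}, indent=2)
    return (
        "var blocked = " + hosts_js + ";\n\n"
        "function FindProxyForURL(url, host) {\n"
        "  if (blocked.hasOwnProperty(host.toLowerCase())) return \"" + blackhole + "\";\n"
        "  return \"DIRECT\";\n"
        "}\n"
    )

pac_text = make_blocking_pac(["mesu.apple.com", "appldnld.apple.com", "gdmf.apple.com"])
```

Writing `pac_text` to a file and serving it as the device's proxy auto-config URL would be the next step in such a setup.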

script fails with sci-libs/scikits_learn

easylist_pac.py says that installing scikits_learn gives more accurate rule selection:

No module named 'sklearn'
easylist_pac.py:32: UserWarning: Install scikit-learn for more accurate EasyList rule selection.
  warnings.warn("Install scikit-learn for more accurate EasyList rule selection.")

After I installed it, the script failed regardless of the scikits_learn version used.

Fail with sci-libs/scikits_learn-0.17.1

Traceback (most recent call last):
  File "easylist_pac.py", line 27, in <module>
    from sklearn.linear_model import LogisticRegression
  File "/usr/lib64/python3.5/site-packages/sklearn/linear_model/__init__.py", line 15, in <module>
    from .least_angle import (Lars, LassoLars, lars_path, LarsCV, LassoLarsCV,
  File "/usr/lib64/python3.5/site-packages/sklearn/linear_model/least_angle.py", line 19, in <module>
    from scipy import linalg, interpolate
  File "/usr/lib64/python3.5/site-packages/scipy/interpolate/__init__.py", line 175, in <module>
    from .interpolate import *
  File "/usr/lib64/python3.5/site-packages/scipy/interpolate/interpolate.py", line 32, in <module>
    from .interpnd import _ndim_coords_from_arrays
  File "interpnd.pyx", line 1, in init scipy.interpolate.interpnd
  File "/usr/lib64/python3.5/site-packages/scipy/spatial/__init__.py", line 95, in <module>
    from .qhull import *
  File "qhull.pyx", line 2155, in init scipy.spatial.qhull
AttributeError: 'cython_function_or_method' object has no attribute '__func__'

Fail with sci-libs/scikits_learn-0.18.2-r1

Traceback (most recent call last):
  File "easylist_pac.py", line 27, in <module>
    from sklearn.linear_model import LogisticRegression
  File "/usr/lib64/python3.5/site-packages/sklearn/__init__.py", line 57, in <module>
    from .base import clone
  File "/usr/lib64/python3.5/site-packages/sklearn/base.py", line 12, in <module>
    from .utils.fixes import signature
  File "/usr/lib64/python3.5/site-packages/sklearn/utils/__init__.py", line 11, in <module>
    from .validation import (as_float_array,
  File "/usr/lib64/python3.5/site-packages/sklearn/utils/validation.py", line 18, in <module>
    from ..utils.fixes import signature
  File "/usr/lib64/python3.5/site-packages/sklearn/utils/fixes.py", line 403, in <module>
    from scipy.stats import rankdata
  File "/usr/lib64/python3.5/site-packages/scipy/stats/__init__.py", line 343, in <module>
    from .stats import *
  File "/usr/lib64/python3.5/site-packages/scipy/stats/stats.py", line 171, in <module>
    from . import distributions
  File "/usr/lib64/python3.5/site-packages/scipy/stats/distributions.py", line 10, in <module>
    from ._distn_infrastructure import (entropy, rv_discrete, rv_continuous,
  File "/usr/lib64/python3.5/site-packages/scipy/stats/_distn_infrastructure.py", line 16, in <module>
    from scipy.misc import doccer
  File "/usr/lib64/python3.5/site-packages/scipy/misc/__init__.py", line 67, in <module>
    from scipy.interpolate._pade import pade as _pade
  File "/usr/lib64/python3.5/site-packages/scipy/interpolate/__init__.py", line 175, in <module>
    from .interpolate import *
  File "/usr/lib64/python3.5/site-packages/scipy/interpolate/interpolate.py", line 32, in <module>
    from .interpnd import _ndim_coords_from_arrays
  File "interpnd.pyx", line 1, in init scipy.interpolate.interpnd
  File "/usr/lib64/python3.5/site-packages/scipy/spatial/__init__.py", line 95, in <module>
    from .qhull import *
  File "qhull.pyx", line 2155, in init scipy.spatial.qhull
AttributeError: 'cython_function_or_method' object has no attribute '__func__'

Fail with sci-libs/scikits_learn-0.19.0

Traceback (most recent call last):
  File "easylist_pac.py", line 27, in <module>
    from sklearn.linear_model import LogisticRegression
  File "/usr/lib64/python3.5/site-packages/sklearn/linear_model/__init__.py", line 12, in <module>
    from .base import LinearRegression
  File "/usr/lib64/python3.5/site-packages/sklearn/linear_model/base.py", line 38, in <module>
    from ..preprocessing.data import normalize as f_normalize
  File "/usr/lib64/python3.5/site-packages/sklearn/preprocessing/__init__.py", line 8, in <module>
    from .data import Binarizer
  File "/usr/lib64/python3.5/site-packages/sklearn/preprocessing/data.py", line 18, in <module>
    from scipy import stats
  File "/usr/lib64/python3.5/site-packages/scipy/stats/__init__.py", line 343, in <module>
    from .stats import *
  File "/usr/lib64/python3.5/site-packages/scipy/stats/stats.py", line 171, in <module>
    from . import distributions
  File "/usr/lib64/python3.5/site-packages/scipy/stats/distributions.py", line 10, in <module>
    from ._distn_infrastructure import (entropy, rv_discrete, rv_continuous,
  File "/usr/lib64/python3.5/site-packages/scipy/stats/_distn_infrastructure.py", line 16, in <module>
    from scipy.misc import doccer
  File "/usr/lib64/python3.5/site-packages/scipy/misc/__init__.py", line 67, in <module>
    from scipy.interpolate._pade import pade as _pade
  File "/usr/lib64/python3.5/site-packages/scipy/interpolate/__init__.py", line 175, in <module>
    from .interpolate import *
  File "/usr/lib64/python3.5/site-packages/scipy/interpolate/interpolate.py", line 32, in <module>
    from .interpnd import _ndim_coords_from_arrays
  File "interpnd.pyx", line 1, in init scipy.interpolate.interpnd
  File "/usr/lib64/python3.5/site-packages/scipy/spatial/__init__.py", line 95, in <module>
    from .qhull import *
  File "qhull.pyx", line 2155, in init scipy.spatial.qhull
AttributeError: 'cython_function_or_method' object has no attribute '__func__'

Just in case this is important: because of https://bugs.gentoo.org/630294, I installed scikits_learn using sci-libs/gsl-2.4 instead of reference.

eselect cblas list
Installed CBLAS for library directory lib64
  [1]   gsl *
  [2]   reference

JavaScript Objects of Length about 1000 hang iOS iCloud-based apps and services on an iPad

I observe that PAC files with JavaScript objects of size 100 or so cause connectivity hangs in iOS iCloud-based apps and services on an iPad.

Mobile Safari and other browsers are not affected.

I believe that this is an iOS bug and I have filed issue #33093977.

Details:

  • This PAC file works with iOS iCloud: proxy_test_create.py 9 > proxy_test_9.pac
  • This PAC file doesn't work with iOS iCloud: proxy_test_create.py 999 > proxy_test_999.pac

proxy_test_create.py:

#!/usr/bin/env python

# Proxy Auto Configuration Generator for iOS
# Small values of n (less than 10) work; large (~1000) hang iCloud and App Store access
# iOS Safari works with all values of n

# Syntax: python proxy_test_create.py [size of JS objects]

import sys

# Get n from the first argument, with a small default value
n = int(sys.argv[1]) if len(sys.argv) > 1 else 9

print("""\
// Proxy Auto Configuration Example for iOS
// Small values of n (less than 10) work; large (100) hangup iCloud and App Store access

// n = {:d}

// JSON object
var jo = {{ {} }};
// Regular Expression
var re = /{}/i;

// Simple FindProxyForURL function
function FindProxyForURL(url, host) {{
	
	// Access the variables defined above, but ignore results
	jo.hasOwnProperty(host) || re.test(host);
	
	return "DIRECT";
}}
""".format(n,
", ".join('"field{:d}": null'.format(k+1) for k in range(n)),
"(?:" + "|".join('word{:d}'.format(k+1) for k in range(n)) + ")"))
