
Comments (4)

crwhite14 commented on May 27, 2024

Hi, we have successfully run this code on two different laptops, and on AWS r5.large, r5.xlarge, and p3.2xlarge instances. Can you share the specs of your computer, or can you try running the code on AWS?

from naszilla.

auroua commented on May 27, 2024

I added code to report the available system memory and commented out some unnecessary code.

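import logging
import os
import time

import numpy as np
import psutil
import tensorflow as tf

# Silence TensorFlow's C++ and Python logging so the memory readings are easy to spot.
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
tf.logging.set_verbosity(tf.logging.ERROR)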


def run_experiments(args, save_dir):

    trials = args.trials
    out_file = args.output_filename
    metann_params = meta_neuralnet_params(args.search_space)
    algorithm_params = algo_params(args.algo_params)
    num_algos = len(algorithm_params)
    logging.info(algorithm_params)

    for i in range(trials):
        results = []
        walltimes = []

        for j in range(num_algos):
            # run NAS algorithm
            print('\n* Running algorithm: {}'.format(algorithm_params[j]))
            print('\n* Trials: {}, Free Memory available {}'.format(i, psutil.virtual_memory().free/(1024*1024)))
            starttime = time.time()
            algo_result = run_nas_algorithm(algorithm_params[j], metann_params)
            algo_result = np.round(algo_result, 5)

            # add walltime and results
            walltimes.append(time.time()-starttime)
            results.append(algo_result)

        # print and pickle results
        filename = os.path.join(save_dir, '{}_{}.pkl'.format(out_file, i))
        print('\n* Trial summary: (params, results, walltimes)')
        # print(algorithm_params)
        # print(metann_params)
        # print(results)
        # print(walltimes)
        # print('\n* Saving to file {}'.format(filename))
        # with open(filename, 'wb') as f:
        #     pickle.dump([algorithm_params, metann_params, results, walltimes], f)
        #     f.close()

As you can see from the output below, each trial consumes around 780 MB of system memory, and my computer has 32 GB in total, so after enough trials the system runs out of memory. I think this code contains a memory leak.

* Running algorithm: {'algo_name': 'bananas', 'total_queries': 150}

* Trials: 0, Free Memory available 23304.01953125

Query 20, Meta neural net train error: 0.6706451761627197
Query 20, top 5 val losses [6.801, 6.891, 6.961, 6.988, 7.455]
Query 30, Meta neural net train error: 0.24904882892608637
Query 30, top 5 val losses [6.09, 6.11, 6.153, 6.297, 6.37]
Query 40, Meta neural net train error: 0.2359766121419271
Query 40, top 5 val losses [5.719, 5.729, 5.906, 5.973, 6.03]
Query 50, Meta neural net train error: 0.3074175923156738
Query 50, top 5 val losses [5.442, 5.596, 5.719, 5.729, 5.746]
Query 60, Meta neural net train error: 0.19907020365905756
Query 60, top 5 val losses [5.442, 5.549, 5.596, 5.709, 5.719]
Query 70, Meta neural net train error: 0.19963256510416666
Query 70, top 5 val losses [5.442, 5.549, 5.596, 5.709, 5.719]
Query 80, Meta neural net train error: 0.2413669417790004
Query 80, top 5 val losses [5.442, 5.549, 5.592, 5.596, 5.709]
Query 90, Meta neural net train error: 0.2929446891307831
Query 90, top 5 val losses [5.442, 5.482, 5.549, 5.592, 5.596]
Query 100, Meta neural net train error: 0.1805207757737901
Query 100, top 5 val losses [5.442, 5.482, 5.549, 5.592, 5.596]
Query 110, Meta neural net train error: 0.22210936895751954
Query 110, top 5 val losses [5.362, 5.442, 5.482, 5.549, 5.576]
Query 120, Meta neural net train error: 0.1752352670634876
Query 120, top 5 val losses [5.362, 5.442, 5.482, 5.482, 5.482]
Query 130, Meta neural net train error: 0.1668418809445699
Query 130, top 5 val losses [5.362, 5.442, 5.482, 5.482, 5.482]
Query 140, Meta neural net train error: 0.358147247766348
Query 140, top 5 val losses [5.362, 5.442, 5.482, 5.482, 5.482]
Query 150, Meta neural net train error: 0.2213685071454729
Query 150, top 5 val losses [5.362, 5.442, 5.482, 5.482, 5.482]

* Trial summary: (params, results, walltimes)

* Running algorithm: {'algo_name': 'bananas', 'total_queries': 150}

* Trials: 1, Free Memory available 16322.4921875
Query 20, Meta neural net train error: 0.39267581130981444
Query 20, top 5 val losses [5.652, 6.036, 6.083, 6.21, 6.477]
Query 30, Meta neural net train error: 0.34255362514495846
Query 30, top 5 val losses [5.652, 5.793, 5.906, 6.036, 6.063]
Query 40, Meta neural net train error: 0.2500657961527507
Query 40, top 5 val losses [5.652, 5.793, 5.906, 6.036, 6.063]
Query 50, Meta neural net train error: 0.24763826026916508
Query 50, top 5 val losses [5.606, 5.652, 5.682, 5.763, 5.793]
Query 60, Meta neural net train error: 0.21709816799926762
Query 60, top 5 val losses [5.606, 5.652, 5.682, 5.763, 5.769]
Query 70, Meta neural net train error: 0.14204328908284503
Query 70, top 5 val losses [5.606, 5.652, 5.682, 5.704, 5.763]
Query 80, Meta neural net train error: 0.2670334483446393
Query 80, top 5 val losses [5.606, 5.652, 5.682, 5.704, 5.729]
Query 90, Meta neural net train error: 0.2646169803714752
Query 90, top 5 val losses [5.559, 5.606, 5.652, 5.682, 5.704]
Query 100, Meta neural net train error: 0.23479244693332246
Query 100, top 5 val losses [5.442, 5.559, 5.606, 5.652, 5.682]
Query 110, Meta neural net train error: 0.299368987663269
Query 110, top 5 val losses [5.442, 5.559, 5.606, 5.652, 5.682]
Query 120, Meta neural net train error: 0.18222559431596236
Query 120, top 5 val losses [5.362, 5.442, 5.559, 5.589, 5.606]
Query 130, Meta neural net train error: 0.15951242336908977
Query 130, top 5 val losses [5.362, 5.442, 5.549, 5.559, 5.589]
Query 140, Meta neural net train error: 0.25150803740868205
Query 140, top 5 val losses [5.362, 5.442, 5.549, 5.559, 5.589]
Query 150, Meta neural net train error: 0.21424750920431954
Query 150, top 5 val losses [5.362, 5.442, 5.549, 5.559, 5.589]

* Trial summary: (params, results, walltimes)

* Running algorithm: {'algo_name': 'bananas', 'total_queries': 150}

* Trials: 2, Free Memory available 15626.890625
Query 20, Meta neural net train error: 0.24437462715148936
Query 20, top 5 val losses [6.03, 6.05, 6.083, 6.106, 6.116]
Query 30, Meta neural net train error: 0.20739694820404048
Query 30, top 5 val losses [5.592, 6.03, 6.05, 6.083, 6.106]
Query 40, Meta neural net train error: 0.28068012634277345
Query 40, top 5 val losses [5.592, 5.826, 5.879, 5.996, 6.03]
Query 50, Meta neural net train error: 0.2798204266738892
Query 50, top 5 val losses [5.592, 5.826, 5.829, 5.836, 5.879]
Query 60, Meta neural net train error: 0.15683800877380377
Query 60, top 5 val losses [5.592, 5.816, 5.826, 5.829, 5.836]
Query 70, Meta neural net train error: 0.1719443667093913
Query 70, top 5 val losses [5.549, 5.592, 5.592, 5.599, 5.696]
Query 80, Meta neural net train error: 0.30628662278311597
Query 80, top 5 val losses [5.549, 5.592, 5.592, 5.599, 5.696]
Query 90, Meta neural net train error: 0.18919254566192628
Query 90, top 5 val losses [5.549, 5.592, 5.592, 5.599, 5.696]
Query 100, Meta neural net train error: 0.2033007059393989
Query 100, top 5 val losses [5.549, 5.562, 5.592, 5.592, 5.599]
Query 110, Meta neural net train error: 0.2649595273895264
Query 110, top 5 val losses [5.549, 5.559, 5.562, 5.592, 5.592]
Query 120, Meta neural net train error: 0.26201712097861546
Query 120, top 5 val losses [5.549, 5.559, 5.562, 5.592, 5.592]
Query 130, Meta neural net train error: 0.2266174406115214
Query 130, top 5 val losses [5.549, 5.559, 5.562, 5.592, 5.592]
Query 140, Meta neural net train error: 0.4055050408700797
Query 140, top 5 val losses [5.549, 5.559, 5.562, 5.592, 5.592]
Query 150, Meta neural net train error: 0.2354819589287894
Query 150, top 5 val losses [5.399, 5.549, 5.559, 5.562, 5.562]

* Trial summary: (params, results, walltimes)

* Running algorithm: {'algo_name': 'bananas', 'total_queries': 150}

* Trials: 3, Free Memory available 14957.86328125
Query 20, Meta neural net train error: 0.38480514625549306
Query 20, top 5 val losses [6.12, 6.22, 6.337, 6.687, 6.801]
Query 30, Meta neural net train error: 0.42228067707061767
Query 30, top 5 val losses [6.036, 6.12, 6.2, 6.22, 6.297]
Query 40, Meta neural net train error: 0.24832236668904623
Query 40, top 5 val losses [5.853, 6.023, 6.03, 6.036, 6.12]
Query 50, Meta neural net train error: 0.3575978270149231
Query 50, top 5 val losses [5.853, 5.966, 6.023, 6.023, 6.03]
Query 60, Meta neural net train error: 0.2644575637054444
Query 60, top 5 val losses [5.589, 5.729, 5.853, 5.966, 6.023]
Query 70, Meta neural net train error: 0.28097091551462805
Query 70, top 5 val losses [5.382, 5.395, 5.442, 5.589, 5.682]
Query 80, Meta neural net train error: 0.35672260220118934
Query 80, top 5 val losses [5.382, 5.395, 5.442, 5.579, 5.589]
Query 90, Meta neural net train error: 0.23635694175720215
Query 90, top 5 val losses [5.382, 5.395, 5.442, 5.485, 5.579]
Query 100, Meta neural net train error: 0.2224740772247314
Query 100, top 5 val losses [5.382, 5.395, 5.442, 5.485, 5.579]
Query 110, Meta neural net train error: 0.37511226256561275
Query 110, top 5 val losses [5.382, 5.395, 5.442, 5.485, 5.579]
Query 120, Meta neural net train error: 0.24871182519392532
Query 120, top 5 val losses [5.382, 5.395, 5.442, 5.485, 5.579]
Query 130, Meta neural net train error: 0.24261404633839928
Query 130, top 5 val losses [5.382, 5.395, 5.442, 5.445, 5.485]
Query 140, Meta neural net train error: 0.324640991122906
Query 140, top 5 val losses [5.382, 5.395, 5.442, 5.445, 5.485]
Query 150, Meta neural net train error: 0.23449292458670482
Query 150, top 5 val losses [5.382, 5.395, 5.442, 5.445, 5.485]

* Trial summary: (params, results, walltimes)

* Running algorithm: {'algo_name': 'bananas', 'total_queries': 150}

* Trials: 4, Free Memory available 14175.98046875
Query 20, Meta neural net train error: 0.19777381866455082
Query 20, top 5 val losses [6.15, 6.233, 6.237, 6.29, 6.54]
Query 30, Meta neural net train error: 0.33169988311767584
Query 30, top 5 val losses [5.826, 5.976, 6.136, 6.15, 6.173]
Query 40, Meta neural net train error: 0.18474372248331705
Query 40, top 5 val losses [5.819, 5.826, 5.963, 5.976, 6.136]
Query 50, Meta neural net train error: 0.18241036819458012
Query 50, top 5 val losses [5.499, 5.819, 5.826, 5.836, 5.963]
Query 60, Meta neural net train error: 0.20234638992309567
Query 60, top 5 val losses [5.499, 5.626, 5.819, 5.826, 5.836]
Query 70, Meta neural net train error: 0.21746386678059895
Query 70, top 5 val losses [5.319, 5.499, 5.626, 5.722, 5.819]
Query 80, Meta neural net train error: 0.37769393118722094
Query 80, top 5 val losses [5.319, 5.472, 5.479, 5.499, 5.626]
Query 90, Meta neural net train error: 0.23968275416374207
Query 90, top 5 val losses [5.319, 5.472, 5.475, 5.479, 5.499]
Query 100, Meta neural net train error: 0.1880398346371121
Query 100, top 5 val losses [5.228, 5.319, 5.472, 5.475, 5.479]
Query 110, Meta neural net train error: 0.2255205536499023
Query 110, top 5 val losses [5.228, 5.319, 5.472, 5.475, 5.479]
Query 120, Meta neural net train error: 0.24608745252435854
Query 120, top 5 val losses [5.228, 5.319, 5.355, 5.472, 5.475]
Query 130, Meta neural net train error: 0.2588342261695862
Query 130, top 5 val losses [5.228, 5.319, 5.355, 5.472, 5.475]
Query 140, Meta neural net train error: 0.39286692769564113
Query 140, top 5 val losses [5.228, 5.319, 5.355, 5.472, 5.475]
Query 150, Meta neural net train error: 0.2517776324026925
Query 150, top 5 val losses [5.228, 5.319, 5.355, 5.472, 5.475]

* Trial summary: (params, results, walltimes)

* Running algorithm: {'algo_name': 'bananas', 'total_queries': 150}

* Trials: 5, Free Memory available 13488.078125
Query 20, Meta neural net train error: 0.32107295265197766
Query 20, top 5 val losses [6.106, 6.2, 6.35, 6.414, 6.574]
Query 30, Meta neural net train error: 0.35497539909362785
Query 30, top 5 val losses [5.893, 5.946, 5.986, 6.106, 6.2]
Query 40, Meta neural net train error: 0.2584914766438802
Query 40, top 5 val losses [5.592, 5.879, 5.893, 5.893, 5.909]
Query 50, Meta neural net train error: 0.3368156593894959
Query 50, top 5 val losses [5.592, 5.879, 5.893, 5.893, 5.909]
Query 60, Meta neural net train error: 0.16931801425170895
Query 60, top 5 val losses [5.592, 5.646, 5.873, 5.879, 5.893]
Query 70, Meta neural net train error: 0.2344456712214152
Query 70, top 5 val losses [5.592, 5.646, 5.763, 5.816, 5.856]
Query 80, Meta neural net train error: 0.2722499880654471
Query 80, top 5 val losses [5.592, 5.646, 5.702, 5.729, 5.763]
Query 90, Meta neural net train error: 0.21072759886741638
Query 90, top 5 val losses [5.549, 5.592, 5.596, 5.646, 5.702]
Query 100, Meta neural net train error: 0.19599684514363605
Query 100, top 5 val losses [5.442, 5.549, 5.592, 5.596, 5.646]
Query 110, Meta neural net train error: 0.40519124424743647
Query 110, top 5 val losses [5.442, 5.549, 5.592, 5.596, 5.646]
Query 120, Meta neural net train error: 0.24805237624428492
Query 120, top 5 val losses [5.442, 5.442, 5.489, 5.549, 5.592]
Query 130, Meta neural net train error: 0.2190764184761047
Query 130, top 5 val losses [5.442, 5.442, 5.489, 5.549, 5.562]
Query 140, Meta neural net train error: 0.2567464097947341
Query 140, top 5 val losses [5.442, 5.442, 5.489, 5.536, 5.549]
Query 150, Meta neural net train error: 0.20198105940682543
Query 150, top 5 val losses [5.442, 5.442, 5.489, 5.536, 5.549]

* Trial summary: (params, results, walltimes)

* Running algorithm: {'algo_name': 'bananas', 'total_queries': 150}

* Trials: 6, Free Memory available 12677.5859375
Query 20, Meta neural net train error: 0.6220761274719239
Query 20, top 5 val losses [6.106, 6.187, 6.293, 6.297, 6.297]
Query 30, Meta neural net train error: 0.3555480918502808
Query 30, top 5 val losses [5.893, 5.996, 6.106, 6.187, 6.2]
Query 40, Meta neural net train error: 0.23437604619344077
Query 40, top 5 val losses [5.779, 5.893, 5.906, 5.95, 5.996]
Query 50, Meta neural net train error: 0.33233889802932737
Query 50, top 5 val losses [5.779, 5.893, 5.906, 5.95, 5.996]
Query 60, Meta neural net train error: 0.23264712953186034
Query 60, top 5 val losses [5.779, 5.813, 5.829, 5.843, 5.893]
Query 70, Meta neural net train error: 0.22285690931955973
Query 70, top 5 val losses [5.489, 5.549, 5.779, 5.813, 5.829]
Query 80, Meta neural net train error: 0.282475566885812
Query 80, top 5 val losses [5.442, 5.489, 5.549, 5.779, 5.813]
Query 90, Meta neural net train error: 0.24372423615455627
Query 90, top 5 val losses [5.442, 5.489, 5.549, 5.559, 5.696]
Query 100, Meta neural net train error: 0.1839351452975803
Query 100, top 5 val losses [5.442, 5.489, 5.536, 5.549, 5.559]
Query 110, Meta neural net train error: 0.24454047916412355
Query 110, top 5 val losses [5.442, 5.489, 5.536, 5.542, 5.549]
Query 120, Meta neural net train error: 0.19209351443897593
Query 120, top 5 val losses [5.442, 5.489, 5.536, 5.542, 5.549]
Query 130, Meta neural net train error: 0.20499185137430825
Query 130, top 5 val losses [5.442, 5.489, 5.536, 5.542, 5.542]
Query 140, Meta neural net train error: 0.28143082357553334
Query 140, top 5 val losses [5.442, 5.489, 5.522, 5.536, 5.542]
Query 150, Meta neural net train error: 0.18069244881221222
Query 150, top 5 val losses [5.442, 5.489, 5.522, 5.536, 5.542]

* Trial summary: (params, results, walltimes)

* Running algorithm: {'algo_name': 'bananas', 'total_queries': 150}

* Trials: 7, Free Memory available 11736.9375
Query 20, Meta neural net train error: 0.43352863883972165
Query 20, top 5 val losses [5.459, 5.756, 5.913, 6.076, 6.153]
Query 30, Meta neural net train error: 0.2548606317138672
Query 30, top 5 val losses [5.105, 5.459, 5.556, 5.756, 5.913]
Query 40, Meta neural net train error: 0.366904562403361
Query 40, top 5 val losses [5.105, 5.459, 5.556, 5.756, 5.773]
Query 50, Meta neural net train error: 0.2076171148872375
Query 50, top 5 val losses [5.105, 5.208, 5.258, 5.258, 5.459]
Query 60, Meta neural net train error: 0.23118201963806148
Query 60, top 5 val losses [5.105, 5.208, 5.258, 5.258, 5.459]
Query 70, Meta neural net train error: 0.2616974017206828
Query 70, top 5 val losses [5.105, 5.105, 5.208, 5.208, 5.258]
Query 80, Meta neural net train error: 0.22984790934971402
Query 80, top 5 val losses [5.105, 5.105, 5.208, 5.208, 5.258]
Query 90, Meta neural net train error: 0.22157757285118101
Query 90, top 5 val losses [5.105, 5.105, 5.208, 5.208, 5.258]
Query 100, Meta neural net train error: 0.19160059139675564
Query 100, top 5 val losses [5.105, 5.105, 5.208, 5.208, 5.258]
Query 110, Meta neural net train error: 0.2084430541229248
Query 110, top 5 val losses [4.945, 5.105, 5.105, 5.208, 5.208]
Query 120, Meta neural net train error: 0.18546269156716083
Query 120, top 5 val losses [4.945, 5.105, 5.105, 5.208, 5.208]
Query 130, Meta neural net train error: 0.17551071809768676
Query 130, top 5 val losses [4.945, 5.105, 5.105, 5.208, 5.208]
Query 140, Meta neural net train error: 0.35753696122389567
Query 140, top 5 val losses [4.945, 5.105, 5.105, 5.208, 5.208]
Query 150, Meta neural net train error: 0.17338022646222792
Query 150, top 5 val losses [4.945, 5.105, 5.105, 5.208, 5.208]

* Trial summary: (params, results, walltimes)

* Running algorithm: {'algo_name': 'bananas', 'total_queries': 150}

* Trials: 8, Free Memory available 11168.69921875
Query 20, Meta neural net train error: 0.30462424118042
Query 20, top 5 val losses [6.16, 6.32, 6.337, 6.41, 6.505]
Query 30, Meta neural net train error: 0.2668299346923828
Query 30, top 5 val losses [6.106, 6.16, 6.32, 6.337, 6.41]
Query 40, Meta neural net train error: 0.2580552927652995
Query 40, top 5 val losses [5.719, 5.813, 6.106, 6.16, 6.237]
Query 50, Meta neural net train error: 0.2070570923995972
Query 50, top 5 val losses [5.719, 5.719, 5.813, 5.813, 5.919]
Query 60, Meta neural net train error: 0.21408739169311525
Query 60, top 5 val losses [5.719, 5.719, 5.813, 5.813, 5.873]
Query 70, Meta neural net train error: 0.21550944006601971
Query 70, top 5 val losses [5.716, 5.719, 5.719, 5.813, 5.813]
Query 80, Meta neural net train error: 0.24081688992091582
Query 80, top 5 val losses [5.716, 5.719, 5.719, 5.739, 5.813]
Query 90, Meta neural net train error: 0.24684562508583072
Query 90, top 5 val losses [5.716, 5.719, 5.719, 5.739, 5.813]
Query 100, Meta neural net train error: 0.2344628215026856
Query 100, top 5 val losses [5.526, 5.592, 5.716, 5.719, 5.719]
Query 110, Meta neural net train error: 0.33575948421478274
Query 110, top 5 val losses [5.425, 5.526, 5.592, 5.686, 5.716]
Query 120, Meta neural net train error: 0.26693639932805846
Query 120, top 5 val losses [5.425, 5.526, 5.562, 5.592, 5.682]
Query 130, Meta neural net train error: 0.21699908062616985
Query 130, top 5 val losses [5.425, 5.425, 5.526, 5.539, 5.562]
Query 140, Meta neural net train error: 0.31123031298123877
Query 140, top 5 val losses [5.425, 5.425, 5.485, 5.526, 5.539]
Query 150, Meta neural net train error: 0.3141454306411743
Query 150, top 5 val losses [5.425, 5.425, 5.485, 5.526, 5.539]

* Trial summary: (params, results, walltimes)

* Running algorithm: {'algo_name': 'bananas', 'total_queries': 150}

* Trials: 9, Free Memory available 10233.87109375
Query 20, Meta neural net train error: 0.312837296295166
Query 20, top 5 val losses [5.813, 6.03, 6.083, 6.12, 6.15]
Query 30, Meta neural net train error: 0.30551274005889895
Query 30, top 5 val losses [5.813, 5.846, 5.879, 6.03, 6.083]
Query 40, Meta neural net train error: 0.27968885782877606
Query 40, top 5 val losses [5.696, 5.776, 5.779, 5.813, 5.829]
Query 50, Meta neural net train error: 0.35744497243881224
Query 50, top 5 val losses [5.442, 5.582, 5.696, 5.776, 5.779]
Query 60, Meta neural net train error: 0.20645350143432614
Query 60, top 5 val losses [5.442, 5.582, 5.672, 5.696, 5.776]
Query 70, Meta neural net train error: 0.14485750765482588
Query 70, top 5 val losses [5.442, 5.445, 5.582, 5.586, 5.672]
Query 80, Meta neural net train error: 0.3652051569366455
Query 80, top 5 val losses [5.442, 5.445, 5.582, 5.586, 5.649]
Query 90, Meta neural net train error: 0.19484986895561215
Query 90, top 5 val losses [4.945, 5.442, 5.445, 5.582, 5.586]
Query 100, Meta neural net train error: 0.1896125703854031
Query 100, top 5 val losses [4.945, 5.442, 5.445, 5.519, 5.544]
Query 110, Meta neural net train error: 0.32435621636199946
Query 110, top 5 val losses [4.945, 5.362, 5.429, 5.442, 5.445]
Query 120, Meta neural net train error: 0.22109225515192205
Query 120, top 5 val losses [4.945, 5.362, 5.429, 5.442, 5.445]
Query 130, Meta neural net train error: 0.23432591149648027
Query 130, top 5 val losses [4.945, 5.362, 5.429, 5.442, 5.445]
Query 140, Meta neural net train error: 0.2926189739168607
Query 140, top 5 val losses [4.945, 5.362, 5.429, 5.442, 5.445]
Query 150, Meta neural net train error: 0.21124934818812782
Query 150, top 5 val losses [4.945, 5.362, 5.429, 5.442, 5.442]

* Trial summary: (params, results, walltimes)

* Running algorithm: {'algo_name': 'bananas', 'total_queries': 150}

* Trials: 10, Free Memory available 9766.75
Query 20, Meta neural net train error: 0.2848729610443117
Query 20, top 5 val losses [5.899, 6.063, 6.11, 6.2, 6.357]
Query 30, Meta neural net train error: 0.33514286155700684
Query 30, top 5 val losses [5.559, 5.899, 6.036, 6.063, 6.066]
Query 40, Meta neural net train error: 0.3059166374460856
Query 40, top 5 val losses [5.539, 5.559, 5.899, 5.906, 6.03]
Query 50, Meta neural net train error: 0.1952446213531494
Query 50, top 5 val losses [5.482, 5.539, 5.559, 5.606, 5.746]
Query 60, Meta neural net train error: 0.16945469247436523
Query 60, top 5 val losses [5.105, 5.482, 5.539, 5.559, 5.606]
Query 70, Meta neural net train error: 0.14652121493021644
Query 70, top 5 val losses [5.105, 5.258, 5.459, 5.482, 5.539]
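The size of the leak can be estimated directly from the "Free Memory available" readings in the log above. The following is a minimal sketch, not part of the original script: the helper name `per_trial_drops` is hypothetical, and treating the first drop as one-time start-up cost is an assumption. The readings are in MiB, because the script divides bytes by 1024*1024.

```python
# Free-memory readings (MiB) copied from the "Trials: N" lines in the log above.
free_mib = [
    23304.01953125, 16322.4921875, 15626.890625, 14957.86328125,
    14175.98046875, 13488.078125, 12677.5859375, 11736.9375,
    11168.69921875, 10233.87109375, 9766.75,
]


def per_trial_drops(readings):
    """Return how much free memory fell between consecutive readings."""
    return [before - after for before, after in zip(readings, readings[1:])]


drops = per_trial_drops(free_mib)

# Assumption: the first drop also contains one-time start-up costs,
# so it is reported separately from the later, steady-state drops.
startup_drop, steady_drops = drops[0], drops[1:]
mean_steady = sum(steady_drops) / len(steady_drops)

print("first trial drop: {:.0f} MiB".format(startup_drop))
print("mean drop per later trial: {:.0f} MiB".format(mean_steady))
```

Every reading is lower than the one before it, and the steady-state drop per trial is of the same order as the estimate given above, which is consistent with memory not being released between trials.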


auroua commented on May 27, 2024

You should explicitly release the TensorFlow session object in the bananas function in nas_algorithms.py. The following is the output after explicitly releasing the session object.


* Running algorithm: {'algo_name': 'bananas', 'total_queries': 150}

* Trials: 0, Free Memory available 23250.765625

Query 20, Meta neural net train error: 0.3487703415679933
Query 20, top 5 val losses [5.843, 5.879, 6.02, 6.026, 6.133]
Query 30, Meta neural net train error: 0.15054341629028326
Query 30, top 5 val losses [5.843, 5.879, 5.896, 6.013, 6.02]
Query 40, Meta neural net train error: 0.25547135447184244
Query 40, top 5 val losses [5.779, 5.843, 5.879, 5.879, 5.896]
Query 50, Meta neural net train error: 0.22749369312286385
Query 50, top 5 val losses [5.779, 5.843, 5.879, 5.879, 5.896]
Query 60, Meta neural net train error: 0.21580750080871583
Query 60, top 5 val losses [5.722, 5.779, 5.813, 5.843, 5.879]
Query 70, Meta neural net train error: 0.14051182483673091
Query 70, top 5 val losses [5.722, 5.779, 5.813, 5.829, 5.843]
Query 80, Meta neural net train error: 0.2027042778996059
Query 80, top 5 val losses [5.722, 5.769, 5.779, 5.813, 5.829]
Query 90, Meta neural net train error: 0.14665883271217345
Query 90, top 5 val losses [5.559, 5.719, 5.722, 5.769, 5.779]
Query 100, Meta neural net train error: 0.22781972122192382
Query 100, top 5 val losses [5.559, 5.559, 5.599, 5.719, 5.722]
Query 110, Meta neural net train error: 0.2804164693450928
Query 110, top 5 val losses [5.559, 5.559, 5.596, 5.599, 5.719]
Query 120, Meta neural net train error: 0.25520122894287106
Query 120, top 5 val losses [5.559, 5.559, 5.596, 5.599, 5.719]
Query 130, Meta neural net train error: 0.20952919589360555
Query 130, top 5 val losses [5.546, 5.559, 5.559, 5.559, 5.596]
Query 140, Meta neural net train error: 0.30798990099393403
Query 140, top 5 val losses [5.482, 5.546, 5.559, 5.559, 5.559]
Query 150, Meta neural net train error: 0.1546503270830427
Query 150, top 5 val losses [5.305, 5.439, 5.482, 5.539, 5.546]

* Trial summary: (params, results, walltimes)

* Running algorithm: {'algo_name': 'bananas', 'total_queries': 150}

* Trials: 1, Free Memory available 19645.84375
Query 20, Meta neural net train error: 0.5405180290985109
Query 20, top 5 val losses [6.076, 6.303, 6.303, 6.337, 6.54]
Query 30, Meta neural net train error: 0.3941140726089477
Query 30, top 5 val losses [5.813, 5.829, 6.076, 6.09, 6.115]
Query 40, Meta neural net train error: 0.2905621319071452
Query 40, top 5 val losses [5.813, 5.829, 5.879, 5.906, 5.95]
Query 50, Meta neural net train error: 0.21056627208709716
Query 50, top 5 val losses [5.549, 5.696, 5.729, 5.813, 5.829]
Query 60, Meta neural net train error: 0.19408362345886232
Query 60, top 5 val losses [5.549, 5.696, 5.729, 5.813, 5.813]
Query 70, Meta neural net train error: 0.19322695072174073
Query 70, top 5 val losses [5.549, 5.696, 5.729, 5.813, 5.813]
Query 80, Meta neural net train error: 0.3732918584224155
Query 80, top 5 val losses [5.549, 5.696, 5.729, 5.813, 5.813]
Query 90, Meta neural net train error: 0.25575729636192324
Query 90, top 5 val losses [5.549, 5.696, 5.729, 5.813, 5.813]
Query 100, Meta neural net train error: 0.14205466328938798
Query 100, top 5 val losses [5.549, 5.559, 5.696, 5.729, 5.813]
Query 110, Meta neural net train error: 0.25456465628814695
Query 110, top 5 val losses [5.362, 5.482, 5.549, 5.559, 5.696]
Query 120, Meta neural net train error: 0.158599561621926
Query 120, top 5 val losses [5.362, 5.482, 5.549, 5.559, 5.682]
Query 130, Meta neural net train error: 0.22499610691706334
Query 130, top 5 val losses [5.362, 5.482, 5.549, 5.559, 5.606]
Query 140, Meta neural net train error: 0.260042774135883
Query 140, top 5 val losses [5.362, 5.482, 5.549, 5.556, 5.559]
Query 150, Meta neural net train error: 0.22109316365923198
Query 150, top 5 val losses [5.362, 5.445, 5.482, 5.536, 5.549]

* Trial summary: (params, results, walltimes)

* Running algorithm: {'algo_name': 'bananas', 'total_queries': 150}

* Trials: 2, Free Memory available 19537.70703125
Query 20, Meta neural net train error: 0.5766716690063477
Query 20, top 5 val losses [5.849, 6.083, 6.267, 6.337, 6.397]
Query 30, Meta neural net train error: 0.3071524828338623
Query 30, top 5 val losses [5.559, 5.729, 5.763, 5.779, 5.849]
Query 40, Meta neural net train error: 0.25682980814615886
Query 40, top 5 val losses [5.442, 5.559, 5.709, 5.729, 5.763]
Query 50, Meta neural net train error: 0.39493933048248286
Query 50, top 5 val losses [5.442, 5.549, 5.559, 5.696, 5.709]
Query 60, Meta neural net train error: 0.33578516896057126
Query 60, top 5 val losses [5.442, 5.549, 5.559, 5.696, 5.696]
Query 70, Meta neural net train error: 0.3033694334411621
Query 70, top 5 val losses [5.442, 5.549, 5.559, 5.606, 5.696]
Query 80, Meta neural net train error: 0.32578006709507534
Query 80, top 5 val losses [5.442, 5.442, 5.549, 5.559, 5.606]
Query 90, Meta neural net train error: 0.3083279574775696
Query 90, top 5 val losses [5.442, 5.442, 5.549, 5.559, 5.606]
Query 100, Meta neural net train error: 0.2510057151285807
Query 100, top 5 val losses [5.439, 5.442, 5.442, 5.549, 5.559]
Query 110, Meta neural net train error: 0.28571165019989014
Query 110, top 5 val losses [5.439, 5.442, 5.442, 5.549, 5.559]
Query 120, Meta neural net train error: 0.22416895188071512
Query 120, top 5 val losses [5.439, 5.442, 5.442, 5.549, 5.559]
Query 130, Meta neural net train error: 0.26137030934015903
Query 130, top 5 val losses [5.439, 5.442, 5.442, 5.549, 5.559]
Query 140, Meta neural net train error: 0.4445019862248348
Query 140, top 5 val losses [5.439, 5.442, 5.442, 5.549, 5.559]
Query 150, Meta neural net train error: 0.23183432327815465
Query 150, top 5 val losses [5.439, 5.442, 5.442, 5.489, 5.549]

* Trial summary: (params, results, walltimes)

* Running algorithm: {'algo_name': 'bananas', 'total_queries': 150}

* Trials: 3, Free Memory available 19502.85546875
Query 20, Meta neural net train error: 0.5223561430358888
Query 20, top 5 val losses [5.879, 6.09, 6.106, 6.14, 6.21]
Query 30, Meta neural net train error: 0.14834258678436285
Query 30, top 5 val losses [5.846, 5.879, 5.893, 5.996, 6.09]
Query 40, Meta neural net train error: 0.42141343673706055
Query 40, top 5 val losses [5.634, 5.729, 5.813, 5.833, 5.846]
Query 50, Meta neural net train error: 0.2237832831382752
Query 50, top 5 val losses [5.634, 5.729, 5.756, 5.779, 5.799]
Query 60, Meta neural net train error: 0.2938297483215332
Query 60, top 5 val losses [5.419, 5.634, 5.729, 5.756, 5.779]
Query 70, Meta neural net train error: 0.17818667655944828
Query 70, top 5 val losses [5.419, 5.512, 5.612, 5.634, 5.636]
Query 80, Meta neural net train error: 0.27593417443411694
Query 80, top 5 val losses [5.419, 5.512, 5.612, 5.634, 5.636]
Query 90, Meta neural net train error: 0.1877837141990662
Query 90, top 5 val losses [5.419, 5.512, 5.542, 5.612, 5.634]
Query 100, Meta neural net train error: 0.23295990307278108
Query 100, top 5 val losses [5.419, 5.512, 5.542, 5.612, 5.619]
Query 110, Meta neural net train error: 0.20185477804565427
Query 110, top 5 val losses [5.415, 5.419, 5.512, 5.542, 5.612]
Query 120, Meta neural net train error: 0.2099861141898416
Query 120, top 5 val losses [5.415, 5.419, 5.425, 5.472, 5.512]
Query 130, Meta neural net train error: 0.23498687627156575
Query 130, top 5 val losses [5.415, 5.419, 5.425, 5.472, 5.512]
Query 140, Meta neural net train error: 0.41029146479679995
Query 140, top 5 val losses [5.415, 5.419, 5.425, 5.442, 5.472]
Query 150, Meta neural net train error: 0.1512694839804513
Query 150, top 5 val losses [5.415, 5.419, 5.425, 5.442, 5.472]

* Trial summary: (params, results, walltimes)

* Running algorithm: {'algo_name': 'bananas', 'total_queries': 150}

* Trials: 4, Free Memory available 19459.86328125
Query 20, Meta neural net train error: 0.4353225244903564
Query 20, top 5 val losses [5.799, 6.03, 6.2, 6.297, 6.337]
Query 30, Meta neural net train error: 0.5584189819717407
Query 30, top 5 val losses [5.799, 5.879, 5.929, 6.03, 6.03]
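The exact change made to the bananas function is not shown in this thread. As a rough sketch of what "explicitly releasing the session" can look like, the pattern below guarantees that `close()` is called on a session-like object even if the work raises. The names `run_with_session`, `FakeSession`, and `make_fake` are hypothetical and exist only so the example runs without TensorFlow.

```python
from contextlib import closing


def run_with_session(make_session, work):
    """Create a session, run work(sess), and always call sess.close()."""
    with closing(make_session()) as sess:
        return work(sess)


class FakeSession:
    """Stand-in for a session object; it only records whether close() ran."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


sessions = []


def make_fake():
    # Keep a reference so the caller can inspect the session afterwards.
    sess = FakeSession()
    sessions.append(sess)
    return sess


result = run_with_session(make_fake, lambda sess: "done")
print(result, sessions[0].closed)
```

With TensorFlow 1.x, which the script above appears to use, `tf.Session` could be passed as `make_session`, since a session object exposes `close()`.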


crwhite14 commented on May 27, 2024

Thanks for the question. We recently fixed this by adding tf.reset_default_graph() and tf.keras.backend.clear_session() inside https://github.com/naszilla/bananas/blob/master/nas_algorithms.py#L188
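For readers applying the same idea to their own experiment loop, here is a minimal sketch of running the two calls after every trial. It is not the code in nas_algorithms.py: the wrapper `run_trials` and the helper `tf_cleanup` are hypothetical, and `tf_cleanup` assumes the TensorFlow 1.x API, where `tf.reset_default_graph()` is available at the top level.

```python
def run_trials(run_one_trial, cleanup, trials):
    """Run each trial, then call cleanup() even if the trial raised."""
    results = []
    for i in range(trials):
        try:
            results.append(run_one_trial(i))
        finally:
            cleanup()
    return results


def tf_cleanup():
    """Drop the graph and Keras state built during the last trial (TF 1.x)."""
    # Imported lazily so run_trials stays usable without TensorFlow installed.
    import tensorflow as tf

    tf.reset_default_graph()
    tf.keras.backend.clear_session()


# Demonstration with stand-in callables instead of a real NAS trial.
calls = []
out = run_trials(lambda i: i * i, lambda: calls.append("cleanup"), trials=3)
print(out, calls)
```

In a real run, `tf_cleanup` would be passed as the `cleanup` argument, so that graph nodes created by one trial's meta neural network do not accumulate into the next.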
