trace_classifier's People

Contributors

shashadehuajiang

trace_classifier's Issues

bug

    def data_enhancement(self,X):
        if self.cfg.PACKET2FLOW == '2dCNN':
            return X # not ready yet

        if self.cfg.PACKET2FLOW == '1dCNN' and self.cfg.SCALE_1dCNN:
            return X # not ready yet

        X_new = copy.deepcopy(X)
        
        for i_f in range(len(X_new)):
            for i_v in range(len(X_new[i_f])):
                X_new[i_f][i_v] = X_new[i_f][i_v][0:2]

        rand_num = random.random()

        if rand_num>0.5 and rand_num<0.7:
            # randomly drop packets
            for i_f in range(len(X_new)):
                for i_v in range(len(X_new[i_f])-1,-1,-1):
                    rand_threshold = random.random()*0.1
                    if random.random() < rand_threshold:
                        del X_new[i_f][i_v]

        if rand_num>0.7 and rand_num<0.8:
            # time shift (order-preserving)
            for i_f in range(len(X_new)):
                for i_v in range(1,len(X_new[i_f])):
                    if i_v == len(X_new[i_f]) - 1:
                        max_size = X_new[i_f][i_v][0]/2
                        random_biaos = (random.random()*2-1)*max_size
                        X_new[i_f][i_v][0] -= random_biaos
                    else:
                        max_size = min(X_new[i_f][i_v][0],X_new[i_f][i_v+1][0])
                        random_biaos = (random.random()*2-1)*max_size
                        X_new[i_f][i_v][0] -= random_biaos
                        X_new[i_f][i_v+1][0] += random_biaos
        
        if rand_num>0.8 and rand_num<1:
            # randomly cut the start and the end
            max_time_list = [max([X_new[i_f][i_v][0] for i_v in range(0,len(X_new[i_f]))]) for i_f in range(len(X_new))] 
            max_time = max(max_time_list)
            
            startcut = (random.random()*0.2) * max_time
            endcut = (1 - random.random()*0.2) * max_time
            for i_f in range(len(X_new)-1,-1,-1):
                for i_v in range(len(X_new[i_f])-1,-1,-1):
                    if X_new[i_f][i_v][0] < startcut or X_new[i_f][i_v][0] > endcut:
                        del X_new[i_f][i_v]
                if len(X_new[i_f]) == 0:
                    del X_new[i_f]



        for i_f in range(len(X_new)):
            for i_v in range(len(X_new[i_f])):
                # add the absolute time
                if i_v>=1:
                    time_sum = X_new[i_f][i_v][0] + X_new[i_f][i_v-1][-1]
                    X_new[i_f][i_v] = np.r_[X_new[i_f][i_v],time_sum]
                else:
                    X_new[i_f][i_v] = np.r_[X_new[i_f][i_v],0]
                # add the order index, clipped to 1
                id_pad = min(i_v/100,1)
                X_new[i_f][i_v] = np.r_[X_new[i_f][i_v],id_pad]
                # add the relative position (percentage)
                percentage = i_v/len(X_new[i_f])
                X_new[i_f][i_v] = np.r_[X_new[i_f][i_v],percentage]

        return X 

This looks wrong: the function should return X_new at the end, not X.
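
To make the report concrete, here is a minimal sketch of the same pattern outside the project. The function names and the toy feature are hypothetical and stand in for `data_enhancement`; only the copy-then-return structure is taken from the snippet above.

```python
import copy


def add_index_feature_buggy(X):
    """Same shape as the reported code: edit a deep copy, return the original."""
    X_new = copy.deepcopy(X)
    for flow in X_new:
        for i, packet in enumerate(flow):
            packet.append(i)  # the new feature exists only on the copy
    return X  # the augmented copy is thrown away


def add_index_feature_fixed(X):
    """Identical work, but the augmented copy is what the caller receives."""
    X_new = copy.deepcopy(X)
    for flow in X_new:
        for i, packet in enumerate(flow):
            packet.append(i)
    return X_new


# One flow with two packets, each given as [time, signed_size] (toy values).
flows = [[[0.0, 60], [0.1, -1500]]]
buggy = add_index_feature_buggy(flows)
fixed = add_index_feature_fixed(flows)
```

With `return X`, the caller gets the untouched input back, so none of the augmentation or the appended features takes effect.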

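Training speed

Why is training slower on a dedicated compute card than on the ordinary GPU in my local machine? Is there something I need to change?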

Processing dataset

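Hello, I have only recently started learning deep learning and came across your paper. I downloaded the original ISD dataset and ran dataset_maker on the pcap files of the two classes, but the resulting JSON file contains only as many entries as there are pcap files, not the 4402 reported in the paper. How is the data split? What do I need to change to obtain 4402 samples? I look forward to your reply, which would help me get past this problem.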

dataset

Hello author, I would like to ask how to use the public datasets mentioned in the article, such as UAV, SWF, and KWS, with this framework. Thank you for your reply.

The problem of Loss equal to nan

We obtain normal results when reproducing the "LSTM+ATT" setting, but when PACKET2FLOW in config.py is set to any other value, the loss becomes NaN.

We suspect a learning-rate problem, but the paper does not give the learning rate for each PACKET2FLOW setting.

Can you provide the specific hyperparameter settings for the different values of PACKET2FLOW and FLOW2TRACE?
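
While the hyperparameters are unknown, a generic way to narrow this down is to check gradients for non-finite values and clip their global norm before each update. The sketch below is framework-independent and is not taken from this repository; `clip_by_global_norm` and its arguments are hypothetical names, and clipping is a common mitigation, not a confirmed fix for this issue.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Scale a flat list of gradient values so their L2 norm is at most max_norm.

    Raises ValueError when the norm is NaN or infinite, which pinpoints the
    first training step at which the loss diverges.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if not math.isfinite(total_norm):
        raise ValueError("non-finite gradient norm")
    # Small epsilon avoids division by zero when all gradients are zero.
    scale = min(1.0, max_norm / (total_norm + 1e-12))
    return [g * scale for g in grads], total_norm


clipped, norm_before = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
```

If the error is raised on the very first steps, the inputs or the initialization are more likely at fault than the learning rate.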

Processing dataset

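After converting other datasets to JSON files with the dataset_maker you provided, I found that their features differ considerably from those in your JSON files, and the training results are poor. How can I extract only the timestamps and the direction-signed packet sizes?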
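
As a rough illustration of the two features being asked about, the sketch below turns already-parsed packet records into `[time, signed_size]` pairs. Everything here is an assumption and not the behaviour of dataset_maker: the input tuple layout, the `client_ip` parameter, the use of inter-arrival times instead of raw timestamps, and the convention that outgoing packets are positive.

```python
def packets_to_features(packets, client_ip):
    """Convert packet records into [inter_arrival_time, signed_size] pairs.

    packets: list of (timestamp_seconds, src_ip, size_bytes), sorted by time.
    Packets sent by client_ip get a positive size, all others a negative size.
    """
    features = []
    prev_ts = None
    for ts, src_ip, size in packets:
        delta = 0.0 if prev_ts is None else ts - prev_ts
        signed_size = size if src_ip == client_ip else -size
        features.append([delta, signed_size])
        prev_ts = ts
    return features


# Toy capture: two outgoing packets and one incoming packet.
packets = [
    (10.0, "10.0.0.1", 60),
    (10.5, "8.8.8.8", 1500),
    (10.75, "10.0.0.1", 52),
]
features = packets_to_features(packets, client_ip="10.0.0.1")
```

Comparing a few rows produced this way against the author's JSON files should show quickly whether the time base or the sign convention is what differs.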

Implementation details about "Unlimited Flow length"

When reproducing the experimental results in "4.3.3 The Benefits of No Traffic Customization (TC)", we ran into some questions about implementation details.

Training with a fixed flow length is standard practice, but how should we understand "Unlimited"? Training is usually difficult when samples in the same batch have different sequence lengths.

We are curious how the model is trained under the "Unlimited Flow Size" setting.

Thank you for your time and consideration.
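
For context, one common way to batch sequences of different lengths is to pad them to the longest sequence in the batch and carry a mask, so that padded steps are ignored later. The sketch below shows that general technique in plain Python; it is not a claim about how this repository implements the "Unlimited" setting, and `pad_batch` and `masked_mean` are hypothetical helpers.

```python
def pad_batch(flows, pad_value=0.0):
    """Pad every flow to the longest flow in the batch.

    flows: list of flows, each a non-empty list of equal-width feature vectors.
    Returns (padded, mask) where mask is 1 for real steps and 0 for padding.
    """
    max_len = max(len(flow) for flow in flows)
    n_feat = len(flows[0][0])
    padded, mask = [], []
    for flow in flows:
        n_pad = max_len - len(flow)
        padded.append([list(p) for p in flow] + [[pad_value] * n_feat for _ in range(n_pad)])
        mask.append([1] * len(flow) + [0] * n_pad)
    return padded, mask


def masked_mean(padded, mask):
    """Per-flow mean of each feature over the real (unpadded) steps only."""
    means = []
    for seq, m in zip(padded, mask):
        n_real = sum(m)
        n_feat = len(seq[0])
        means.append([
            sum(step[j] for step, keep in zip(seq, m) if keep) / n_real
            for j in range(n_feat)
        ])
    return means


flows = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],  # three packets
    [[10.0, 20.0]],                        # one packet
]
padded, mask = pad_batch(flows)
means = masked_mean(padded, mask)
```

Other options with the same effect include a batch size of 1 or grouping flows of similar length into the same batch; which of these, if any, the author used is not stated in the issue.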
