trinkle23897 / tuixue.online-visa
https://tuixue.online/visa/ A Real-time Display of U.S. Visa Appointment Status Website. A crawler that tracks the earliest available U.S. visa appointment time at each visa office.
Home Page: https://tuixue.online/visa
Hi,
Has anyone investigated the rate-limit policy of the AIS available-appointment API?
https://ais.usvisa-info.com/{country_code}/niv/schedule/{schedule}/appointment/days/{facility_id}.json?appointments[expedite]=false
I have done some investigation. What I've found so far is that the rate limit is based on user identity, not on IP address or anything else. It also uses a simple counter of the number of requests rather than a common rate-limiting algorithm such as fixed window or sliding window.
I created this issue to share findings on this topic, so please share your own investigations. I hope we can find a way to bypass the rate limit.
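To make the distinction concrete, here is a minimal Python sketch contrasting a plain per-user request counter with a fixed-window limiter. It is purely illustrative: the class names, the limits, and the assumption that the simple counter never resets on its own are hypothetical, and the AIS server's actual implementation is not known.

```python
from collections import defaultdict


class SimpleCounterLimiter:
    """Hypothetical: counts requests per user and never resets on its own."""

    def __init__(self, max_requests: int) -> None:
        self.max_requests = max_requests
        self.counts = defaultdict(int)

    def allow(self, user_id: str) -> bool:
        self.counts[user_id] += 1
        return self.counts[user_id] <= self.max_requests


class FixedWindowLimiter:
    """Hypothetical: counts requests per user within fixed time windows."""

    def __init__(self, max_requests: int, window_seconds: int) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.counts = defaultdict(int)  # keyed by (user_id, window index)

    def allow(self, user_id: str, now: float) -> bool:
        window = int(now // self.window_seconds)
        self.counts[(user_id, window)] += 1
        return self.counts[(user_id, window)] <= self.max_requests


simple = SimpleCounterLimiter(max_requests=2)
windowed = FixedWindowLimiter(max_requests=2, window_seconds=60)

# Three requests by one user: the third is rejected by both limiters.
simple_results = [simple.allow("alice") for _ in range(3)]
window_results = [windowed.allow("alice", now=t) for t in (0, 1, 2)]

# A different user is unaffected, since the limit is keyed by identity.
other_user_ok = simple.allow("bob")

# Only the windowed limiter recovers once a new window starts.
simple_later = simple.allow("alice")
window_later = windowed.allow("alice", now=61)
```

With a plain counter, waiting does not help the same user; with a fixed window, the budget comes back when the next window begins.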
I haven't tried this myself before, but have you considered crawling with Tor to avoid IP address blocking?
Thank you very much!
I can't thank you enough.
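Over the past couple of days I've been thinking about the API endpoint design question raised earlier in the discussion group:
one endpoint to get the current time, one to get the range of variation over a continuous period (each day's earliest, latest, and mode), and one to get the raw data for an entire day.
The current API design (/backend/global) accepts region + sys in the request and then computes the embassy/consulate list inside the API. This now looks like a very inflexible design: at the database level, visa status is stored by the 2-tuple (or 3-tuple) (visa_type, embassy_code, Optional[write_date]), so if we later add region/sys or some other classification scheme, the whole route has to be rewritten. So I think this part of the design can be improved.
The general idea is to hand the more volatile logic of how to classify and group to the frontend. Routes that query the database accept only visa_type, embassy_code, and other pagination-related query parameters. When the page loads, the frontend first sends an AJAX request to the backend to read metadata such as region and embassy_lst, then sends a second AJAX request with the default values to read the actual visa status data. One advantage of React (or any other modern JS frontend framework with AJAX) is that while the browser is fetching data from the database over the network, the user does not see a blank page; part of the already-loaded information is displayed. So although there will be two AJAX requests, I think the UX/UI should be okay.
What I have in mind is that all endpoints related to GETting available visa appointment times start with /visastatus, as follows:
/visastatus/meta
Metadata from global_var.py. The advantage is that the frontend codebase does not need to define these variables; only the backend global_var has to be maintained.
/visastatus/earliest?visa_type={vt}&embassy_code={ec}&since={dt}&to={dt}
Derived from the /backend/global query, with the region and sys query params removed and embassy_code added. This way the logic of this endpoint (and some of the later ones) can stay basically unchanged. For pagination, I now think it is better to switch to since and to for declaring the concrete time span. The pagination logic can be converted in the frontend from numbers (the previous skip & take) to UTC strings, which FastAPI automatically converts to Python datetime objects.
/visastatus/latest?visa_type={vt}&embassy_code={ec}
/visastatus/{visa_type}/{embassy_code}?since={dt}&to={dt}
Uses since and to to limit the amount of data. Fetching all historical data puts heavy computational pressure on the backend, because newly written data is not guaranteed to be in order (as I recall we discussed before), so sorting is involved. If the queried data can be limited to one day, it is fine. Considering that this endpoint probably won't get as much traffic as the first two (pure speculation, to be verified in practice), I chose this default for the sake of flexibility.
P.S.: in all the routes above, visa_type and embassy_code can be made to accept a list.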
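As a rough illustration of the proposal, the Python sketch below builds a request URL for the proposed /visastatus/earliest endpoint, with since and to sent as timezone-aware ISO 8601 strings. The helper names page_to_span and build_earliest_url, the day-based conversion from skip & take, and the sample values "F" and "pp" are hypothetical; only the route and the query parameter names come from the proposal.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlencode, urlsplit


def page_to_span(now: datetime, skip_days: int, take_days: int):
    """Convert numeric pagination (skip & take, in days) to a (since, to) span."""
    to = now - timedelta(days=skip_days)
    since = to - timedelta(days=take_days)
    return since, to


def build_earliest_url(visa_type: str, embassy_code: str,
                       since: datetime, to: datetime) -> str:
    """Build the query for the proposed /visastatus/earliest endpoint."""
    query = urlencode({
        "visa_type": visa_type,
        "embassy_code": embassy_code,
        # Timezone-aware ISO 8601 strings, so the backend can parse them
        # into datetime objects without guessing the offset.
        "since": since.isoformat(),
        "to": to.isoformat(),
    })
    return f"/visastatus/earliest?{query}"


now = datetime(2020, 10, 10, 12, 0, tzinfo=timezone.utc)
since, to = page_to_span(now, skip_days=0, take_days=7)
url = build_earliest_url("F", "pp", since, to)

# Round-trip the query string to show the backend-side view of the params.
params = parse_qs(urlsplit(url).query)
parsed_since = datetime.fromisoformat(params["since"][0])
```

Keeping the time span explicit means the backend never has to know how the frontend paginates; it only sees two datetimes.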
Similar to #67
I'm trying to subscribe to L1 appointments for the Melbourne consulate. I don't receive an email confirmation for the subscription.
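@Trinkle23897 Hello, because of a visa issue I wanted to write a similar crawler myself, but after searching I found you had already written one, so I went ahead and used it! I just sent a red packet over WeChat as a thank-you!
May I ask how you obtain the visa appointment times? (I no longer need to write it myself, but I'd still like to know.)
I skimmed the code, and there seems to be a place where random users are registered: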
tuixue.online-visa/visa2/visa.py
Line 46 in 0428b24
Hello,
Thanks for all the work you've done on this project. I was curious how often these dates are checked. When a date is posted in the Telegram channel, I have trouble finding it online (AIS) even seconds after it appears. Is the FastAPI endpoint closer to real time?
Thanks!
Sorry for the trouble, and thank you!
I did not see Italy included. Is it possible to add Italy as well?
Thank you so much
I have subscribed for the India consulates, but I am not getting any notifications.
Can you please check whether the email addresses below have been added?
[email protected]
[email protected]
Hey, have U.S. visa appointments in London stopped?
Hi!
Thanks for doing this; it's really helping a lot of people. Just wondering, is it possible to add TCN=NO updates for the Canadian consulates? The slots allocated for TCN=Yes are just too limited.
Thanks!
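Could you add a notification feature for the newest available times (for example, email push)? The current blue highlighting is really useful, but one still has to keep the web page open and check it periodically. Thanks!
Thank you so much for your work!! If I want to build a crawler for Kenya (AIS system) on top of yours, where would be a good place to start? Is there documentation to refer to? Thanks!!! 🙏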
Notifications, both the email and the website ones, seem not to be working.
Tried with hotmail.com and outlook.com addresses, for B1/B2.
Hello, I am trying to run the API server, but had issues with node.txt not being defined.
node as the input parameters?
tuixue.online-visa/api/tuixue/views.py
Line 74 in 31bc185
node.txt, what is needed for this file?
tuixue.online-visa/api/tuixue/ais_reg.py
Line 36 in 31bc185
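First of all, thanks to the original poster for making such a useful tool! I noticed there is no data for Vancouver, and after searching past issues, is it because the crawler got banned? So is there no way to fix this?
Unable to obtain a new session
Tested locally with the URL: http://127.0.0.1:8888/register/?type=F&place=北京
The test returns {"code": 402, "msg": "Network Error"}
A possible cause is that Login can no longer get a normal response from the original URL.
Printing the response shows a status code of 403.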
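I subscribed to Shenyang F1 with the date set to August 1. I received an email alert at 9:48 today but none at 18:48, and it isn't in the spam folder either. I don't know what went wrong.
It would help with waiting to snag a freed-up slot~
As titled: I fetched the available time slots for a date and got back an empty list [] or [None].
Is this because someone booked it first, or is it a special date that cannot be booked, or a fake anti-bot date?
A really great tool! The web version in particular is very user-friendly. Have you ever worked with VFS Global's UK appointments?
Thanks again for such a great project!! Does the pass-through notification pop up as a Windows notification?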
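Hello, I'm trying to call ais_reg and ais_refersh directly, but I found that the webdriver raises an error here: https://github.com/Trinkle23897/tuixue.online-visa/blob/master/api/tuixue/ais_reg.py#L36.
Where does this node.txt come from, and how can I generate such a node.txt?
Thanks.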
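I couldn't make sense of visa.json... I'm just crawling the per-city daily page, but that page seems to have a lower update priority? The change at 16:48 in the afternoon only showed up in my crawl at 16:58. I'd also like to ask how visa.json should be crawled orz
First, hats off to jiayi. Also, I'm currently in Phnom Penh and can't scan WeChat/Alipay to donate because of regional restrictions... When will overseas payments be supported lol.
Also, although I've already arrived in Cambodia, I haven't managed to book a non-resident interview, so I want to write a script that grabs a slot automatically. I have a few technical questions (of course, if you feel this would infringe on your intellectual property, please just refuse; I'd completely understand lol).
https://tuixue.online/global/crawler/F/金边/2020/{MM}/{DD}
would be enough.
As titled, it seems the Django backend no longer works, given that no dates are shown with an unpaid account. Any workaround?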
@Trinkle23897 and I discussed how to handle timezone-related issues in the backend. The approach we settled on is a bit unusual and needs detailed, precise documentation. This issue serves that purpose.
The data on the production server is stored in structured files at /visa_type/location/YYYY/MM/DD, where DD is a text file that stores all the successfully fetched results for the date YYYY/MM/DD (file-path) at minute granularity. Each line in a DD file has the format hh:mm YYYY/MM/DD, where hh:mm is the time the result was fetched and the string YYYY/MM/DD (file-line) is the fetched available appointment date. A line hh:mm YYYY/MM/DD in a given /visa_type/location/YYYY/MM/DD file tells you: "On the date YYYY/MM/DD (file-path), at hh:mm, one can schedule an appointment on the date YYYY/MM/DD (file-line) for a {visa_type} visa at the U.S. Embassy/Consulate in {location}."
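To make the file layout concrete, here is a small Python sketch that parses one such DD file into write-time and available-date records. The function name parse_day_file and the sample path and lines are made up for illustration, and timezones are deliberately ignored at this stage.

```python
from datetime import datetime
from pathlib import PurePosixPath


def parse_day_file(path: str, content: str):
    """Parse a /visa_type/location/YYYY/MM/DD file into structured records."""
    parts = PurePosixPath(path).parts  # ('/', visa_type, location, YYYY, MM, DD)
    visa_type, location = parts[1], parts[2]
    year, month, day = (int(p) for p in parts[3:6])

    records = []
    for line in content.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fetched_at, available = line.split()
        hour, minute = (int(p) for p in fetched_at.split(":"))
        records.append({
            "visa_type": visa_type,
            "location": location,
            # file-path date + hh:mm = moment the result was fetched
            "write_time": datetime(year, month, day, hour, minute),
            # file-line date = fetched available appointment date
            "available_date": datetime.strptime(available, "%Y/%m/%d").date(),
        })
    return records


# Hypothetical file content: two fetches one minute apart.
sample = "09:48 2020/11/02\n09:49 2020/10/28\n"
records = parse_day_file("/F/Beijing/2020/10/10", sample)
```

Both datetimes produced here are naive, which is exactly where the timezone question comes from.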
If we call YYYY/MM/DD hh:mm (file-path) the write_time and YYYY/MM/DD (file-line) the available_date (for the visa interview appointment), we have an issue:
All write_time values are in the UTC+8 offset, whereas all available_date values are dates in the local timezone of the given U.S. Embassy.
Currently, data in Mongo is stored in documents defined by visa type, embassy code, and the date of fetching, which is a time in the UTC+8 offset. But Mongo assumes all datetimes are in UTC. Simply converting the write_time is not enough to fix the issue, specifically for the visa status overview.
The visa status overview contains the metadata (overview) calculated from the data fetched on a given fetching date (write_date), including the earliest and latest available appointment dates (minimum and maximum datetime). A new issue arises here: the previously calculated overview compares all data fetched from 00:00 to 23:59 in the UTC+8 offset. So if the new overview is calculated from 00:00 to 23:59 in UTC+0, it will differ from the previously calculated one.
From a user's perspective, when looking at the overview of available dates, what the user most likely cares about is the available date in local time when they travel to the country where the U.S. Embassy is located. So it seems reasonable to calculate the earliest and latest dates over 00:00 to 23:59 in the local time zone.
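The effect of the day boundary can be seen in a short Python sketch. It takes a write_time recorded in UTC+8, converts it to UTC and to an embassy's local offset, and shows that the same instant can fall on different calendar days. The fixed +07:00 offset (Phnom Penh) and the sample timestamp are assumptions for illustration; a real implementation would likely use proper time zone data rather than fixed offsets.

```python
from datetime import datetime, timedelta, timezone

UTC8 = timezone(timedelta(hours=8))      # offset the crawler wrote in
LOCAL = timezone(timedelta(hours=7))     # assumed embassy offset (Phnom Penh)

# A fetch recorded shortly after midnight in UTC+8.
write_time_utc8 = datetime(2020, 10, 10, 0, 30, tzinfo=UTC8)

# Same instant expressed in UTC and in the embassy's local time.
write_time_utc = write_time_utc8.astimezone(timezone.utc)
write_time_local = write_time_utc8.astimezone(LOCAL)

# The calendar day the record is bucketed into depends on the offset used.
day_utc8 = write_time_utc8.date().isoformat()
day_utc = write_time_utc.date().isoformat()
day_local = write_time_local.date().isoformat()
```

Here the UTC+8 record belongs to 2020-10-10, while in both UTC and the assumed local offset it belongs to 2020-10-09, so a daily earliest/latest summary changes depending on which day boundary is used.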
So here is the solution for handling the timezone issue in the backend:
available_date data are stored as is (what we fetch is what we store).
write_time and write_date data in the Mongo collections visa_status and latest_written are stored in UTC+0 standard time.
write_time and write_date data in the Mongo collection overview are stored in the local time zone of the given U.S. Embassy location. E.g. the overview data of the U.S. Embassy in Phnom Penh on Oct 10th, 2020 covers the time range "2020-10-10T00:00+07:00" to "2020-10-10T23:59+07:00", NOT "2020-10-10T00:00+00:00" to "2020-10-10T23:59+00:00".
Date.toISOString is the default way we construct time-related queries in a request URL in the frontend. The FastAPI backend should add a layer of logic that checks that each received datetime object has a tzinfo attribute, and otherwise returns a 422 status code.
Hello, I would gladly make a pull request, but I don't see where I could pull the consulate codes for Poland's Krakow and Warsaw.
Thank you
Hi, from my experience you need to use a paid user, and after that you can scan every 30 seconds.
This is the better choice.
Hi~ Thanks for this amazing tool. Just wondering if you still monitor the slots in Bern, Switzerland. I joined the Telegram group for visa type B and saw that the last message related to Bern was back in March.
Thanks in advance!
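As titled; this way one can see whether there is a chance of snagging a freed-up slot.
The website has had no data for the B2 visa at the Vancouver consulate, and I haven't received a confirmation for the email alert subscription either.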
Hello, is there any way to get the alerts in the Telegram group closer to real time?
As far as I can tell from reading some of the code, the application alerts every 10 minutes?
I can't deploy the Python app locally to test it myself. Am I forgetting something? The error I get is that I don't have the file secrets.json.
Hi,
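Thanks for this very useful tool! I noticed that the email subscription doesn't seem to send a confirmation email. Is it down? Or does the subscription become active directly without confirmation?
BTW, is the Telegram subscription still available? I couldn't find "the tab next to the chart" mentioned on the website.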
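As titled: I want to snag a slot earlier than the one I have already booked. Under the current alert policy, though, most alerts are actually unnecessary. Would it be feasible to let users set a manual threshold?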
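A manual threshold of this kind could be as simple as the following Python sketch: an alert is sent only when the newly fetched date is strictly earlier than a user-chosen cutoff, such as the date already booked. The function name should_notify and the sample dates are hypothetical and not part of the project's code.

```python
from datetime import date


def should_notify(available: date, threshold: date) -> bool:
    """Alert only if the fetched date beats the user's own cutoff."""
    return available < threshold


booked = date(2021, 4, 20)  # hypothetical: the slot the user already holds

earlier = should_notify(date(2021, 3, 15), booked)  # worth an alert
later = should_notify(date(2021, 5, 1), booked)     # not useful to this user
```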
This issue is dedicated to recording the requirements for the next version of the frontend.
Today (2/2) I got the newest fetched time for Shanghai (L visa) as 2/27, but I cannot see that date on my appointment page, where the earliest date I can book is 4/20. Ruling out the possibility that someone else booked it, could this date be invalid because of the "Date Ahead" rule, since appointments should be made 37~71 days in advance?
appointments should be made 37 ~ 71 days in advance
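The arithmetic behind this question can be checked with a few lines of Python. The sketch below counts the days between the fetch date (2/2) and the reported slot (2/27) and compares the result with the quoted 37~71 day window. The year 2021 is an assumption, since the post gives none, and the function name is hypothetical; whether the booking system actually enforces this window is not confirmed here.

```python
from datetime import date

MIN_DAYS_AHEAD = 37
MAX_DAYS_AHEAD = 71


def within_booking_window(today: date, slot: date) -> bool:
    """True if the slot is 37 to 71 days after today (inclusive)."""
    days_ahead = (slot - today).days
    return MIN_DAYS_AHEAD <= days_ahead <= MAX_DAYS_AHEAD


today = date(2021, 2, 2)   # assumed year
slot = date(2021, 2, 27)

days_ahead = (slot - today).days
bookable = within_booking_window(today, slot)
```

Under these assumptions 2/27 is only 25 days ahead, which is below the 37-day minimum.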
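When I booked a visa appointment in Germany this March, the relevant information was still available. But checking over the past month, I found that the data for Germany is gone. What is the reason? Will it be restored? Thanks!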
It looks like the app has been down (or not generating data for one reason or another) since 2023/07/17 at ~11 AM CST.
Thanks so much for the amazing project!
Also, please let me know if this is not the best way to get in contact with you.
As titled. Many, many thanks.
Hi, thanks for such a nice tool. My friends got a visa appointment successfully one month ago.
But in the last two weeks, some friends and I have not received any email notifications. My appointment is for B1/B2 in Germany.
Could you please help us to check the email notifications?
Thanks.