
  • SahAssar 1 day ago

    Since it wraps existing browsers (chromium/safari/electron/webkit) is it really fair to not say that in the readme? Sure it mentions that you should install them under "quickstart", but all the other parts compare itself directly against other browsers when this really seems to fill more of a "plugin-like" role than a full browser.

    Also, when writing readmes for ambitious projects it's probably best to mark which features are complete and which ones are just planned.

    Right now it looks like the whole V-lang debacle IMO with lots of features promised and no way to know what is implemented and not, while wrapping other projects without acknowledging it.

    • cookiengineer 1 day ago

      > Also, when writing readmes for ambitious projects it's probably best to mark which features are complete and which ones are just planned.

      I think that's what issues and milestones are for. I tried to document as much as possible. Currently, a lot of UI features are not ready, and the HTML and CSS parser both need major rework.

      > Right now it looks like the whole V-lang debacle IMO with lots of features promised and no way to know what is implemented and not, while wrapping other projects without acknowledging it.

      Not a single external library is included; pretty much everything was implemented by myself over the last 1.5 years.

      Please don't confuse a Browser with a Browser Engine. I never claimed to replace WebKit, Blink, or Gecko.

      For now, I, as a single person working on the project, just didn't have the time to fork Servo into a reduced library that can be bundled - because that's a shitload of work. At the moment I'm focussing on parsing and filtering, because the underlying network concept has been proven to work. I had to build a test runner, too, to even be able to test these behaviours.

      Currently the prototype of the UI is implemented in HTML5, reusing as much as possible from the nodejs-side ES2018 modules.

      • SahAssar 1 day ago

        > I think that's what issues and milestones are for. I tried to document as much as possible.

        I agree, but then those should not be mentioned in the readme as "Features". Are there any things mentioned in the readme in present tense that are not currently implemented?

        > Please don't confuse a Browser with a Browser Engine. I never claimed to replace WebKit, Blink, or Gecko.

        In https://github.com/tholian-network/stealth/blob/X0/browser/b... you spawn chromium (or other browsers). Chromium is a browser.

        I'm not saying that you have to fork or implement a browser, I'm just saying that if you wrap an existing browser it's best to be upfront about it and not make it sound like you have implemented a browser. I wouldn't have objected if the readme said something like "It uses chromium/electron or other browsers under the hood" at the beginning.

        This is especially critical if you claim improved privacy or security.

        > For now, I, as a single person working on the project

        I'm not complaining about the quality or the scope of the work. I'm saying that what the readme presents as the current status of the project does not match reality.

        • cookiengineer 1 day ago

          I totally agree with your points. Currently, Stealth reuses existing Webviews in order to get to the MVP as fast as possible.

          Later on, there probably will be a custom Browser Engine fork that only contains the necessary features.

          Thanks for the hints, I'm gonna read through the README again and update it to be fully transparent.

          • SahAssar 12 hours ago

            I saw that you updated the readme to include that it is based on other browsers, great!

            The feature list still seems intact though, so if that is intended to represent the current state perhaps you can help me understand these points (And please forgive me if you just intended to update it later):

            ---

            > It uses trust-based Peers to share the local cache. Peers can receive, interchange, and synchronize their downloaded media

            What is a trusted peer? How is trust established?

            > It is peer-to-peer and always uses the most efficient way to share resources and to reduce bandwidth, which means downloaded websites are readable even when being completely offline.

            What P2P tech is used here? Again, how is trust handled for resources downloaded over P2P networks?

            > The Browser is completely free-of-DOM

            Are createElement, appendChild, and other similar methods used here not part of the DOM? https://github.com/tholian-network/stealth/blob/1efe80f8baf6...

            ---

            I'm saying that if you make extraordinary claims you will need extraordinary evidence. Or say that they are ambitions until you can produce that evidence, which is completely fine!

    • mmerlin 1 day ago

      Many innovative ideas here.

      I especially like the shared cache with trusted peers idea, and remotely scriptable sounds useful too.

      But no web forms?

      How do we interact with websites (like me posting this comment right now, for example)?

      And no DOM? What is the DOM translated into?

      And no ECMAscript? Won't that break half the web from being usable?

      And yet it can also become a web proxy for regular browsers?

      Curious to watch this project mature, as it seems there are several excellent lateral ideas all being developed at once!

      • _where 1 day ago

        If the network isn’t free and data is centralized, one day you think you have it all and the next you could have nothing. Tor pretends to be secure, but is dark and compromised. This project seems to understand that and wants to try again to fix via P2P in a way that has promise.

        The simple implementation of web forms is broken in today’s web. It’s an input field or other element styled as an input field that may or may not be grouped or in a form object, possibly generated dynamically. Websockets, timers, validations, ... it’s a huge PITA.

        The DOM is a freaking mess. It’s not there until it’s there, it’s detached, it’s shared. It’s been gangbanged so much, there’s no clear parent anymore.

        ECMAScript - which version and which interpretation should Babel translate for you, and would you like obfuscation via webpack, and how about some source maps with that, so you can de-obfuscate to debug it? Yarn, requireJS, npm, and you need an end script tag - should it go in the body or the head? You know the page isn't fully loaded yet, and it won't ever be. There, it's done, until that timer goes off. Each element generated was just regenerated and the old ones are hidden, but the new ones have the same script with different references. Sorry, that was the old framework, use this one, it's newer and this blog or survey says more people use it.

        For a P2P open data sharing network over https, the proxy could allow a request to get someone else down the path. Not everything is direct.

        • anonymouszx 1 day ago

          > Tor pretends to be secure, but is dark and compromised.

          Citation needed. Please stop with the "tor is compromised" meme... and what do you even mean by "dark"? What the hell... Tor is by no means a perfect anonymity solution but it's to my knowledge the best we've got. It's certainly way better than a VPN or no anonymization at all.

          More specifically, tor anonymity is limited by the fact that it's low-latency. This is a fundamental limitation of any low-latency transport layer and not the fault of the tor developers or any obscure forces. In particular, if your attacker has control of both your entry point (your tor guard node or your ISP) and your exit point (tor exit node, or the tor hidden service or website you are connecting to), it becomes possible to de-anonymize your connection (to the specific exit point in question) through traffic analysis. There's just no way around that for a network meant to transport real-time traffic (as opposed to plain data or email for instance). And yes, it stands to reason that various intelligence agencies will have invested in running exit nodes or entry nodes, but this is just unavoidable. What you can do to counteract this is to run your own nodes or to donate to (presumably) trustworthy node operators.

          I think it's also worth noting that although tor can by no means 100% guarantee that you will be free from government surveillance at all times, it does make mass surveillance more difficult and more error-prone, and to me that's the whole point. Furthermore, although government surveillance cannot be thwarted 100%, tor does make corporate surveillance basically impossible (assuming you can avoid browser fingerprinting; this is what the tor browser is for).

          All in all, I can't claim tor is perfect (because it can't be!) but the more people use it the better it gets and it's certainly better than anything else, so please stop spreading FUD and encourage people to use it instead.

          Also, it's unclear to me how Stealth helps at all with hiding the IP addresses of its participants... It claims to be "private" but the README doesn't say anything about network privacy...

          • nullstyle 1 day ago

            Their claims around privacy seem pretty hand-wavy to me; I’d love to be wrong.

            As an example, their usage of DNS is simply DNS-over-HTTPS against the big public resolvers in a round-robin fashion: https://github.com/tholian-network/stealth/blob/X0/stealth/s...

            The code doesn’t strike me as concerning itself with protecting privacy so much as changing who will get to log your traffic. Interesting effort though; I’ll hope for more details from them in the future!

            • cookiengineer 1 day ago

              How would you implement it then?

              Note that the DNS queries are only done when 1) there's no host in the local cache and 2) no trusted peer has resolved it either.

              I mean, DNS is how the internet works. Can't do much about it except caching and delegation to avoid traceable specificity.

              Pretty much anything else isn't much more secure. Note also that DNS will respect the TOR proxy settings, like everything else regarding networking.
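
              To make that order concrete, it boils down to something like this (simplified sketch with made-up names, not the actual Stealth code):

                  // Illustrative sketch of the fallback order - the cache/peer APIs here are made up.
                  async function resolveHost(host, cache, trustedPeers) {
                    const cached = cache.get(host);                      // 1) local host cache
                    if (cached) return cached;

                    for (const peer of trustedPeers) {                   // 2) trusted peers' caches
                      const answer = await peer.resolve(host).catch(() => null);
                      if (answer) return answer;
                    }

                    // 3) only then fall back to DNS-over-HTTPS (Google's JSON endpoint as an example)
                    const res  = await fetch(`https://dns.google/resolve?name=${host}&type=A`);
                    const json = await res.json();
                    return json.Answer ? json.Answer[0].data : null;
                  }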

              • nullstyle 1 day ago

                “How would you implement it then?”

                Chill, bro. I said “seems hand-wavy” and “I’d love to be wrong”. I was hedging my bets and clearly indicating this was a surface-level read. I shouldn’t have to have a better alternative on deck to point out something in the codebase that didn’t seem to be privacy-friendly. No offense was meant.

                Since you asked how I would do things: I would have had a clear and detailed security-specific document or section of the readme to detail in what ways it is peer-to-peer and in what ways it is private. I would have probably gestured towards the threat model I used when designing the protocols, but, let's be honest, I'd probably be too lazy to document it adequately. As far as I can tell, there's one paragraph in its developer guide on security and two paragraphs on peer-to-peer communication and I wasn't able to get a good read on its concrete design or characteristics.

                > Note that the DNS queries are only done when 1) there's no host in the local cache and 2) no trusted peer has resolved it either.

                This wasn’t clear to me from my first spelunk through the readme or the docs. Are you affiliated with the project? Is there a good security overview of the project you know of?

                > I mean, DNS is how the internet works. Can't do much about it except caching and delegation to avoid traceable specificity.

                What I meant to say is, I was not so sure that the google public dns could be considered private. But nevermind on that, I can’t confirm their logging policies. I’m probably just paranoid about how easy google seems to build a profile on me. So yeah, as mentioned, just my initial read.

                • cookiengineer 1 day ago

                  Hey, my comment wasn't meant in a defensive manner... I'm just curious whether I maybe missed a new approach to gathering DNS data :)

                  I've seen some new protocols that try to build a trustless, blockchain-inspired system, but they aren't really there yet and sometimes still have recursion problems.

                  When I was visiting a friend in France I first realized how much is censored there by ISPs, Cloudflare, Google and others, which is why I decided it might be a good approach to have a ronin here.

                  I totally agree that the threat model isn't documented. Currently the peer-to-peer stuff is mostly manual, as there's no way to discover peers (yet). So you would have to add other local machines yourself in the browser settings.

                  Security-wise there are currently a lot of things changing, such as the upcoming DNS tunnel protocol that can use other dedicated peers that are already connected to the clearnet, by encapsulating e.g. HTTPS inside DNS via fake TXT queries.
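
                  To illustrate the tunnel idea (toy sketch only - the real protocol isn't finalized, and the domains/names here are made up):

                      // Toy illustration of the idea only; label charset/length edge cases are ignored.
                      import { Resolver } from 'node:dns/promises';

                      // Encode a payload as DNS labels under a domain that a clearnet peer controls;
                      // that peer decodes the labels, performs the real HTTPS request, and sends the
                      // response back chunked into TXT records.
                      function payloadToQueries(payload, tunnelDomain) {
                        const encoded = Buffer.from(payload).toString('base64url');
                        const labels  = encoded.match(/.{1,60}/g) || [];
                        return labels.map((label, i) => `${i}.${label}.${tunnelDomain}`);
                      }

                      const resolver = new Resolver();
                      const request  = 'GET / HTTP/1.1\r\nHost: example.com\r\n\r\n';
                      for (const name of payloadToQueries(request, 'tunnel.example')) {
                        const txt = await resolver.resolveTxt(name).catch(() => []);   // response chunks arrive as TXT
                        console.log(name, txt.flat().join(''));
                      }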

                  > public dns could be considered private

                  Totally agree here, I tried to find as many DoT and DoH dns servers as possible, and the list was actually longer before.

                  In 2019 a lot of DNS providers either went broke or went commercial (like NextDNS, which now requires a unique id per user, which defeats the purpose of it completely)... But maybe someone knows a good DoH/DoT directory that's better than the curl wiki on GitHub?

                  • nullstyle 18 hours ago

                    Thanks for following up with added info! I'll look forward to seeing the project progress; it's an area I'm super interested in. As far as naming systems better at privacy than DNS, I'm not aware of any serious options. Personally, I'm working on implementing something that hopes to improve the verifiability of name resolutions, but that's a long way off: https://tools.ietf.org/html/draft-watson-dinrg-delmap-02

          • StavrosK 1 day ago

            How confident are we that Tor is compromised?

            • cookiengineer 1 day ago

              > How confident are we that Tor is compromised?

              Large-scale actors (read: ISPs and government agencies) run a huge number of entry and exit nodes. They can simply measure timestamps and stream byte sizes, which allows them to trace your IP and geolocation.

              They do not have to decrypt HTTPS traffic for that, because the order of those streams is pretty unique when it comes to target IPs and timestamps.
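
              As a toy illustration with made-up numbers (grossly simplified, but it shows the kind of matching I mean):

                  // Toy numbers only - real correlation attacks are statistical, not exact matching.
                  const entryFlows = [                                    // seen at the guard / ISP side
                    { user: '203.0.113.7',  bytes: 48210, t: 1000.0 },
                    { user: '198.51.100.3', bytes: 9120,  t: 1000.4 }
                  ];
                  const exitFlows = [                                     // seen at the exit / destination side
                    { site: 'example.com',  bytes: 48210, t: 1000.9 }
                  ];

                  for (const exit of exitFlows) {
                    const hit = entryFlows.find(e =>
                      Math.abs(e.bytes - exit.bytes) < 512 &&             // near-identical stream size
                      exit.t - e.t > 0 && exit.t - e.t < 2);              // plausible latency window
                    if (hit) console.log(`${hit.user} most likely fetched ${exit.site}`);
                  }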

              • TehCorwiz 1 day ago

                I was under the impression that Tor hidden services were still safe since they don't rely on exit nodes.

                • cookiengineer 1 day ago

                  Yes, hidden services are safe (well, no system is really safe). But if e.g. a hidden service includes a web resource from the clearnet, it can be traced.

                  I was talking about the "using tor to anonymize my IP" use case, where exit nodes get a huge amount of traffic per session.

                  In order to be really anonymous you would need a custom client-side engine that randomizes the order of external resources, pauses/resumes requests (given 206 or chunked encoding is supported), and/or introduces null bytes to get a different stream byte size after TLS encryption is added.
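
                  Roughly along these lines (a rough sketch of the idea only - nothing like this is implemented anywhere yet):

                      // Sketch of the idea: shuffle resource order, pull random-sized ranges with jitter.
                      async function fetchDisguised(urls) {
                        const shuffled = [...urls].sort(() => Math.random() - 0.5);        // randomize resource order
                        for (const url of shuffled) {
                          let offset = 0;
                          while (true) {
                            const size = 4096 + Math.floor(Math.random() * 8192);          // random chunk size
                            const res  = await fetch(url, {
                              headers: { Range: `bytes=${offset}-${offset + size - 1}` }   // pause/resume via 206
                            });
                            await res.arrayBuffer();
                            if (res.status !== 206) break;                                 // server ignored the Range header
                            const total = Number((res.headers.get('content-range') || '').split('/')[1]);
                            offset += size;
                            if (!Number.isFinite(total) || offset >= total) break;
                            await new Promise(r => setTimeout(r, Math.random() * 500));    // jitter between chunks
                          }
                          // padding (e.g. null bytes) would have to happen below the TLS layer to change
                          // the on-wire byte size, so it is only hinted at here.
                        }
                      }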

                  • anonymouszx 1 day ago

                    Hidden services are safer in the sense that your connection can't be deanonymized with the help of your third relay (which would have been an exit node in the case of a clearnet connection), but if the hidden service in question were a honeypot and your entry point (ISP or tor guard node) were monitored by the same entity (this second requirement also holds for clearnet connection monitoring, BTW), it would be possible to deanonymize your connection to the hidden service.

                    How easy it is to perform the traffic analysis would have to depend on the amount of data being transferred, if I had to guess, so downloading a video would probably be worse than browsing a plaintext forum like hackernews. But if we're talking about a honeypot, your browser could be easily tricked into downloading large-enough files even from a plaintext website (just add several megabytes of comments in the webpage source for instance).

                    > In order to be really anon you would need a custom client side engine that randomizes the order of external resources, and pauses/resumes requests (given 206 or chunked encoding is supported), and/or introduces null bytes to have a different stream bytesize after TLS encryption is added.

                    It's unclear to me how any of this helps avoid traffic analysis. I believe tor already pads data into 512-byte cells, which might help a little bit.

              • _where 1 day ago

                If you were interested in intelligence, and you wanted to maintain that pipeline of intelligence, would you give up that information?

                Act in plain sight and do good, and you’ll be ok.

                • eeZah7Ux 1 day ago

                  We aren't. This is mostly FUD.

            • rictic 1 day ago

              > fallsback to http:// only when necessary and only when the website was not MITM-ed

              How would you know when an HTTP site is being MITM'd? There are some easy cases, but for everything else, well, ensuring this is half the point and most of the operational complexity of HTTPS!

              • cookiengineer 1 day ago

                https is used primarily. If there's only http available, trusted peers are asked for two things: their host caches for that domain and whether or not the data was transferred securely via https (port and protocol).

                If either of those isn't statistically confirmed, it is assumed that the targeted website is compromised.
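
                In pseudo-code terms it's roughly this (simplified sketch - the peer method names are made up, not the actual implementation):

                    // Simplified sketch of the peer consensus check for http-only hosts.
                    async function verifyHttpFallback(host, trustedPeers) {
                      const reports = [];
                      for (const peer of trustedPeers) {
                        // each trusted peer reports what it has cached for that host:
                        // which protocol/port it saw and whether https ever worked for it
                        const report = await peer.queryHostCache(host).catch(() => null);
                        if (report !== null) reports.push(report);
                      }
                      if (reports.length === 0) return 'unknown';

                      const sawHttps = reports.filter(r => r.protocol === 'https' && r.port === 443).length;
                      // if peers have seen a working https endpoint, the http-only response we got is
                      // treated as a likely downgrade/MITM; otherwise the http fallback is accepted
                      return sawHttps / reports.length >= 0.5 ? 'assume-mitm' : 'http-ok';
                    }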

                Currently I think this is as good an approach as possible, but otherwise I have no idea how to verify that the website is legit without introducing too much traffic overhead for the network.

                Personally, I wouldn't trust any http website, or any https website with TLS < 1.2, anyways. But whether or not that assumption can be extrapolated... dunno.

                Do you have another way to verify its authenticity in mind?

                • rictic 11 hours ago

                  I don't. Authentication is a pain, and the later in the stack you try to solve it the harder and hackier it gets. We have the gross hack of certificate authorities because we failed to deliver authenticated information via the domain name system.

                  > their host caches for that domain

                  Hm, would this trip a MITM flag if someone switched hosting providers? Like, if example.com is http only and is moved to a new datacenter, is there a way to distinguish this from someone MITMing the traffic?

              • dogma1138 1 day ago

                For anything security focused, Node and Electron are an odd choice.

                • mimsee 1 day ago

                  I hear this a lot and hence have the same feeling about node/electron, but is there actually anything of substance when it comes to the lack of security of Node/Electron apps? Honestly curious.

                  • chippy 1 day ago

                    Node implies npm spaghetti / dependency vulnerabilities (e.g. left-pad)

                    Electron I guess implies backdoors via binary blobs upstream?

                    • dogma1138 1 day ago

                      Other posters have already pointed at some examples; it boils down to the fact that you are taking on a relatively large attack surface, and one that is rather difficult to validate effectively and exhaustively.

                      • rrdharan 1 day ago

                        Electron lags chromium HEAD, and node+electron is a complicated architecture to secure correctly:

                        https://www.blackhat.com/docs/us-17/thursday/us-17-Carettoni...

                        https://github.com/minbrowser/min/issues/440

                        • jpangs88 1 day ago

                          I feel like node/Electron are about as safe as chromium; it's npm packages that have lost everyone's confidence.

                          • dogma1138 1 day ago

                            Not exactly: the chromium/chrome sandbox isn't dependent on how and what code you execute; the electron/node one is, and that is because the latter was designed to execute code across many more privilege levels than what a "dedicated" browser needs.

                            If I download and build chromium (as long as I don't disable the sandbox altogether) I don't actually need to think about those issues, while I do need to with Electron.

                            • _where 1 day ago

                              Is Chromium’s sandbox insecure?

                              Electron has local file access, etc. In fact, it states: "Under no circumstances should you load and execute remote code with Node.js integration enabled."

                              So, Stealth should consider forking Electron if better sandboxing is needed.

                              https://www.electronjs.org/docs/tutorial/security

                              That doesn’t prevent it from being secure, though.

                        • eitland 1 day ago

                          Between this and the Sciter (edit: added missing r, thanks everyone :-) project the other day and a number of other projects I'm starting to get optimistic that the web might soon be ready for a real overhaul.

                          • godelmachine 1 day ago

                            May I ask for a link to Scite project? Curious.

                            • arghwhat 1 day ago
                              • hobofan 1 day ago

                                I'm not sure if GP meant sciter or scite.ai (both of which had a few posts about them recently), or even SciTE.

                                However I don't see how either of those indicates some "real overhaul of the web", as sciter seems to be "just" an embeddable HTML/CSS engine, which doesn't seem like a big change compared to e.g. webview[0].

                                [0]: https://github.com/webview/webview

                                • arghwhat 1 day ago

                                  Sciter.JS is "just" an Electron alternative that is much, much lighter.

                                  Something like that could undo some of the damage done by using web tech for desktop, but it would indeed not overhaul the web in the slightest.

                                  • eitland 1 day ago

                                    > but it would indeed not overhaul the web in the slightest.

                                    Kind of agree.

                                    What I am hoping for is

                                    either a new rendering engine that is purposely incompatible with abusive websites and so much faster that people like us will use it anyway everywhere it works, and just keep a mainstream browser as a backup for abusive web pages,

                                    or more realistically something like asm.js where Firefox made an alternative faster path for Javascript that adhered to certain rules.

                                • mushufasa 1 day ago

                                  it does not look like that kickstarter is going to succeed -- the deadline is today, it's at ~10% funded, and it's all-or-nothing

                              • Looked through a bunch of the author's projects, cool/funny stuff, keep it up.

                              A couple things;

                              - The project (and others) need clearer calls to action or goals. Reading the different pages made me think a bunch, but I had no idea what to do.

                              - Maybe the Stealth browser is not meant for everyone. Maybe just a community of people will use the browser and contribute to your goal of decentralized semantic data.

                              And really, your vision is so big, it might be worth doing a video.

                              • cookiengineer 1 day ago

                                Author of the project here. I didn't expect this to be posted on HN because the project is kind of still in its infancy.

                                The motivation behind the Browser was that I usually "use the Web" as a knowledge resource, reading articles online, on blogs, on news websites, on social media and so on. But there are a couple of problems with what a Browser currently is. A Browser is made for manual human interaction, not for self-automation of repetitive tasks; automation is only available at the mercy of programming or extensions, which I do not think is the reasonable way to go.

                                Why block everything except 2 things on a website when you could just grab the information you expect a website to contain?

                                I'm calling it a Semantic Web Browser because I want to build a p2p network that understands the knowledge that websites contain via its site adapters (beacons) and workflows (echoes), while the underlying concept tries to decentralize as many automation aspects as possible.
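
                                To give a very rough idea, a beacon could conceptually be little more than a declarative description of what a site contains and what to extract from it (purely hypothetical shape, not an actual format):

                                    // Purely hypothetical shape - the real beacon format may look nothing like this.
                                    const beacon = {
                                      domain: 'news.example.com',
                                      extract: {
                                        title:  'article h1',
                                        author: 'article .byline',
                                        body:   'article .content p',
                                        date:   'article time[datetime]'
                                      },
                                      share: { text: true, image: false, video: false }
                                    };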

                                In the past a lot of websites were taken down (some for BS reasons, some not), but what's more important to me is the knowledge that is lost forever. Even if the knowledge is web-archived, the discovery (and sharing) aspects are gone, too.

                                My goal with the project is a network that tries to find truth and knowledge in the semantic content of the web; I'm trying to build something that understands bias in articles, the authors of said articles, and the history of articles that were (re-)posted on social media with biased perspectives.

                                I know that NLP currently isn't that far along, but I think that with swarm intelligence ideas (taken from Honeybee Democracy and similar research on bees) and compositional game theory, it might be possible to have a system that defends itself against financial actors.

                                Currently the Browser doesn't have a fully functioning UI/UX yet, and the parsers are being refactored. So it's still a prototype.

                                It is a decentralized Browser in the sense that if you have trusted peers in your local (or global) network, you can reuse their caches and share automation aspects with them (your friend's or your family's browser(s)), which allows fully decentralized (static) ways to archive websites and their links to other websites.

                                I'm not sure where the journey is heading, to be honest, but I think the Tholian race and the naming make it clear: "Be correct; we do not tolerate deceit" pretty much sums up why I built this thing.

                                Currently I don't have funding, and I'm trying to build up a startup around the idea of this "Web Intelligence Network", where I see a potential business model for large-scale actors that want to do web scraping and/or gathering via the "extraction adapters" of websites that are maintained by the network.

                                I think this project turned out to be very important to me, especially when taking a glimpse at the post-COVID social media that contains so much bullshit that you could easily lose hope for humanity.

                                • epitactic 1 day ago

                                  This project looks amazingly promising, thank you for creating it and I wish you the best of luck in its success.

                                  One humble suggestion/idea I offer to think about, related to:

                                  > It uses trust-based Peers to share the local cache. Peers can receive, interchange, and synchronize their downloaded media. This is especially helpful in rural areas, where internet bandwidth is sparse; and redundant downloads can be saved. Just bookmark Stealth as a Web App on your Android phone and you have direct access to your downloaded wikis, yay!

                                  Trusted peers with a shared web cache is a good start, but how about _trustless_ peers? Is this possible?

                                  Possibly using something like https://tlsnotary.org - which uses TLS to provide cryptographic proof of the authenticity of saved HTTPS pages (but unfortunately only works with TLS 1.0)

                                  • cookiengineer 1 day ago

                                    I'm still reading through the code and the paper, but this sounds actually amazing.

                                    I planned on integrating a self-signed intermediary certificate for TLS anyways, so that peer-to-peer communication can be encrypted without a third-party handshake.

                                    It sounds like this would integrate very nicely as a hashing/verification mechanism for shared caches. Thanks much for the hint!
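
                                    Usage-wise that would boil down to something like the sketch below (Node's built-in tls module; the certificate generation is assumed to have happened elsewhere and the file names are placeholders). Both sides reject anything not signed by the shared intermediary, so no public CA is involved in the peer-to-peer handshake.

                                        // Sketch only: both peers hold a cert signed by the same self-managed
                                        // intermediary and pin that intermediary instead of the public CA set.
                                        const tls = require('node:tls');
                                        const fs  = require('node:fs');

                                        const intermediary = fs.readFileSync('stealth-intermediary.pem');   // placeholder file names

                                        const server = tls.createServer({
                                          key:  fs.readFileSync('peer-a.key'),
                                          cert: fs.readFileSync('peer-a.pem'),
                                          ca:   [intermediary],
                                          requestCert: true,            // peers authenticate each other (mutual TLS)
                                          rejectUnauthorized: true
                                        }, socket => socket.end('hello peer'));

                                        server.listen(65432, () => {
                                          const client = tls.connect({
                                            host: 'localhost', port: 65432,
                                            key:  fs.readFileSync('peer-b.key'),
                                            cert: fs.readFileSync('peer-b.pem'),
                                            ca:   [intermediary]        // trust only the shared intermediary (cert SANs assumed to match)
                                          }, () => console.log('authorized:', client.authorized));
                                        });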

                                  • URfejk 1 day ago

                                    > I didn't expect this to be posted on HN because the project is kind of still in its infancy.

                                    Hi, one of my minions found it and I decided to post it here.

                                    Sorry about that.

                                    • cookiengineer 1 day ago

                                      No worries, I'm happy to see the discussion here and the ideas that other people already had ^_^

                                      Thanks much for the recognition :)

                                • Sephr 1 day ago

                                  > In case a website is not available anymore, the stealth:fix-request error page allows to download websites automagically from trusted Peers

                                  How is trust inferred? I'd rather have a trustless architecture built out of dumb pipes and Signed HTTP Exchanges.

                                  • darepublic 1 day ago

                                    Will this let you navigate to a site with video, and stream that to a peer?

                                    • cookiengineer 1 day ago

                                      All requests are shareable. Conditions for this are:

                                      1. You have a trusted peer with a local IP configured (peer A knows Peer B and vice versa)

                                      2. Peer A is downloading the url currently (stash) or is done downloading (cache)

                                      3. Peer B can then reuse the same stream or download the file via Peer A

                                      Note that this is also why Stealth has an HTML5 UI. Download a video on the desktop, leave Stealth running, and go to your Android or iOS tablet... connect to desktop-ip:65432. Open up the video and get the same stream, too :)

                                      • darepublic 17 hours ago

                                        I was thinking more along the lines of: could I share media tracks from a video element on a page with my peer? I wasn't sure if this would be possible because the browser is headless.

                                    • _where 1 day ago

                                      I feel like this is Christmas. Thank you.

                                      • gunal2 1 day ago

                                        Is it possible to bypass SSL forcing with sslstrip?

                                        • _where 1 day ago

                                          SSL alone is not enough to protect data.

                                          Any proxy could act as a MITM, so someone using a malicious fork of Stealth may cause problems.

                                          But the net is like this already. One site may send you to another site that tricks you into giving up your data. And a relatively recent vulnerability kept any WebKit-based browser from indicating whether the site's URL was actually pointing at the correct server, so you'd have no visible way of knowing whether a site using HTTPS was legitimate.

                                          Using a VPN could be better, but it’s sometimes worse, because you change who is trusted more (the VPN provider), as they know one of the addresses you’re coming from and everything you’re doing, and can record and sell that data.

                                        • GoLogin 1 day ago

                                          I would highly recommend trying the Antidetection Browser GoLogin for multi-accounting.

                                          GoLogin's advantages:

                                          - All profiles are separated and protected: each profile is in a separate container so that their data do not conflict with each other.

                                          - Identity protection: before using the browser, GoLogin will open a page with your connection data, so you can make sure it is safe and anonymous.

                                          - Antidetect without installing software: you only need a regular browser and Internet access; you are not tied to a specific place.

                                          - Automation: automate any emulation process in a real browser. This will make your digital fingerprints look natural and your accounts will definitely not be blocked.

                                          - Teamwork: one-click access to any profile for each team member without any risk of blocking or leaking account data.

                                          https://gologin.com/?utm_source=forum&utm_medium=comment&utm...