Giter Club home page Giter Club logo

Comments (14)

AllanCameron avatar AllanCameron commented on June 26, 2024 1

I think for consistency we should. So far we have largely stuck to the principle of least astonishment, and I think text spacing applying to all (non-language) strings is what users would expect.

from geomtextpath.

AllanCameron avatar AllanCameron commented on June 26, 2024

Yes, it's pretty ambitious, but I agree it would be a very nice addition, and I am starting to have faith in our ability to manage such things.

It seems to me that since we are suggesting such fancy additions as sf and rich text support that we are close to being satisfied with the core features of the package. We should probably have a think about what we want to be available in the package on its initial CRAN submission. My wish list would be:

  • A labelpath equivalent of each of the line geoms
  • Thorough documentation
  • 100% test coverage
  • Tidy code with an emphasis on consistency, readability and simplicity
  • Some rigorous exploration of edge cases

But most of this stuff is of course less exciting than creating cool new features, and I am enjoying myself.

I am happy to look into a geom_textsf if you want to explore the rich text support. I don't think either of them are essential, but then we are probably going to want to try to add them at some point anyway, and we are in no rush. I'd be interested to hear your thoughts on what you would consider as essential before considering CRAN submission.

from geomtextpath.

AllanCameron avatar AllanCameron commented on June 26, 2024

Having had a look at the source code for gridtext, the css / html parsing is pretty basic, using regex and so on. We could easily write something ourselves that does the job. I have some experience in writing tokenizers and parsers in C++, and parsing markdown with a limited subset of html (as gridtext currently supports) would actually be pretty easy. Normally C++ would be better for this kind of thing because of its speed, but even Claus Wilke (who likes to use C++) doesn't seem to have bothered because the strings being processed are so short.

The main problem I foresee is that to preserve the curves we would have to build all of the rich text implementation at grob level. This would involve quite a bit of modification and added complexity to the textpath Grob. If you are very keen to implement this, I think it would be best to have a separate richtextpathGrob that re-uses the helper functions rather than trying to add the functionality to textpathGrob.

from geomtextpath.

teunbrand avatar teunbrand commented on June 26, 2024

Yes I agree on this being separate from the regular textpathGrob(). Presumably, the goal of our package is not to published encyclopaedic books with curved text, so I think that staying within R with slightly slower speed is reasonable given that C++ is harder to maintain (I don't want to be bogged down with compilation issues on Solaris, for example). Also, I don't see why the parsing can't be handled mostly outside of the makeContent() method, so the speed doesn't matter (as much).

My wish list would be:

These all seem reasonable, but there are some points I would partially deviate from. I would trade-off tidiness of code for efficiency and maintainability. All else being equal, tidy code is better than untidy code, but I'd find it more important that code is efficient and easy to maintain. The 100% test coverage is probably also a bit ambitious, especially given that the coverage of ggproto methods isn't measured. Test coverage alone doesn't really mean anything if the tests themselves aren't any good. What the tests should be used for is to warn whenever a change breaks something elsewhere, so that we can tinker with the code in whatever way we like with the peace of mind that we'll know when we'd break something. Again, all else being equal, more coverage is better than less coverage.

from geomtextpath.

AllanCameron avatar AllanCameron commented on June 26, 2024

The 100% test coverage is probably also a bit ambitious, especially given that the coverage of ggproto methods isn't measured.

We have reached 100% code coverage on several occasions during this project (I was very pleased to see the green badge when this happened!), for example after this commit. covr has a great tool for showing you the code paths that are and aren't covered by your tests, so it is actually quite easy to write tests purely to ensure that all code paths are covered. The ggproto methods are counted as having being covered when they are called during plot drawing - we have the odd expect_silent(..plot code...) in the tests that give us a heads up if the code breaks badly enough that a plot doesn't actually get produced.

Of course, I agree that the test coverage itself doesn't tell us much; it is the stability of the program under specific inputs that we need to ensure, and this is what I meant by rigorous exploration of edge cases. However, I have learned by experience that when there are some code paths that don't get tested, they end up not doing what you expect them to do when an end-user stumbles upon them. There are also many instances when you realise that a particular conditional is never tested because it is a logical impossibility for any input to trigger it.

I'm fine with keeping everything in R rather than writing code that needs to be compiled. My experience with writing a large and complex C++ package is that as long as you are writing straightforward code with standard library functions you don't get any complaints from an R CMD check, but I agree it would just be overkill for what you're proposing.

I would trade-off tidiness of code for efficiency and maintainability

I think the key there is maintainability, and for this I think code should be easy to read and understand. I have to admit it takes me a while to get my head around some of the tactics we use to avoid loops, and that makes it a bit harder to debug. I know that some code has "irreducible complexity" (to borrow a terrible phrase), but complex parts can often be broken down into maintainable and understandable functions. Don't get me wrong; I'm pretty happy with the code base as it is.

What are your thoughts on looking at the rich text implementation? It does look like a lot of work, but not insurmountable.

from geomtextpath.

teunbrand avatar teunbrand commented on June 26, 2024

What are your thoughts on looking at the rich text implementation?

This seems within the realm of feasibility, so I think the first step is to ask Claus Wilke whether he is OK with us reusing some of his code. I know there isn't anything stopping us legally, since {gridtext} is licenced under MIT, but it just seems like bad etiquette to not ask. If we get a no, we can always start from scratch.

you don't get any complaints from an R CMD check

I'm not so much worried about R CMD check, I'm more worried about CRAN detecting a potential problem that is difficult to debug without having access to the OS in which the problem was detected.

some of the tactics we use to avoid loops

Yes, I do have an active tendency to avoid loops whenever I see an opportunity to vectorise the code, I admit. I agree with you that loops are easier to read in many cases, and it will probably cost only a few milliseconds or less to leave a loop as-is, but this adds up the more loops you have. I have no problem with containerising a loop-to-vector re-coding into informatively named functions.

from geomtextpath.

teunbrand avatar teunbrand commented on June 26, 2024

Hi Alan,

Conceptually, I'm running into the following issue. The makeContent.textpath and makeContent.richtextpath are going to be near identical, except for 2 lines of code that would track the substring for rich text. My worry is that we can easily add these two lines to the textpath method, but we've been keen to add stuff to this already. Do you think we can sacrifice two changing lines in the texpath method to avoid some duplicated code? Or do you have a strong preference to keep them separate (and does your opinion change if it were 6 lines of code, for example)?

The current status of this issue is that I've got the grobs working with the <p> and <span> tags with a limited subset of the style attribute. Next, I'm working towards a geom_richtextpath() after which I can expand the number of supported tags to mirror a good portion of {gridtext}.

image

from geomtextpath.

AllanCameron avatar AllanCameron commented on June 26, 2024

This looks like good progress Teun. I don't mind if there's a few extra lines of code. I like to reduce parameters where possible, but I'm guessing this whole mechanism only needs a single "parse_richtext" switch added to the textpathGrob?

On balance, I prefer not having duplicated code or near-identical functions if there is a way to do this inside the existing code.

Incidentally, how are you handling the parsing? Did you write to Claus Wilkes, or did you roll your own?

from geomtextpath.

teunbrand avatar teunbrand commented on June 26, 2024

I'm guessing this whole mechanism only needs a single "parse_richtext" switch added to the textpathGrob?

That is indeed the essence, the two other extra bits is that measure_richtext() and gp_text are using substrings, but that is all upstream of the makeContent method.

how are you handling the parsing?

I indeed wrote Claus Wilke an email, who told me that we could pretty much do what ever we want, since his code has the MIT licence. I now have a comment on top of the file of borrowed functions that the copyright to code in that file belongs to Claus and not us. I still have to investigate if we could simplify, as we don't need to keep track of textbox settings.

from geomtextpath.

AllanCameron avatar AllanCameron commented on June 26, 2024

Excellent.

We seem to have a problem with the CMD check on the latest Ubuntu, but it doesn't seem we've done anything to cause it - the setup won't allow it to install some packages. I'm not sure what we can do about this other than exclude the latest experimental Ubuntu off our CMD check.

from geomtextpath.

teunbrand avatar teunbrand commented on June 26, 2024

Here is my to-do list for this issue:

  • Investigate why the PR fails on R devel.
  • Double check that nested <sub>/<sup> tags are working correctly.
  • Integrate with textpathGrob()
    • Include a rich = TRUE/FALSE switch to turn the functionality on or off.
    • Add rich parameter to the static_text_params() function to have it automatically available in all children of GeomTextpath/GeomLabelpath. Probably throw error when rich = TRUE and parse = TRUE as I don't think we can mix plotmath with rich-text.
    • Implement text measurement for straight labels.

from geomtextpath.

teunbrand avatar teunbrand commented on June 26, 2024

the setup won't allow it to install some packages.

I haven't seen this in GHA, but it is throwing errors on my PR for other reasons, so I'll investigate. I don't think turning it off is the wisest course of action as it will increase the chance of unwelcome surprises if/when we submit to CRAN (which also requires a package to run on R devel).

from geomtextpath.

AllanCameron avatar AllanCameron commented on June 26, 2024

The GHA issue I mentioned above seems to have been a temporary issue that resolved itself after I waited a few hours and re-ran it.

from geomtextpath.

teunbrand avatar teunbrand commented on June 26, 2024

I just realised while working on rich, straight labels a bit: the 'spacing' argument is ignored when straight = TRUE. This is also true for non-rich text.

library(geomtextpath)
#> Loading required package: ggplot2
#> Warning: package 'ggplot2' was built under R version 4.1.1

df <- data.frame(
  x = c(1, 2),
  y = 0,
  label = "Test"
)

ggplot(df, aes(x, y, label = label)) +
  geom_textpath(size = 8, spacing = 1000)

ggplot(df, aes(x, y, label = label)) +
  geom_textpath(size = 8, spacing = 1000, straight = TRUE)

Created on 2021-12-28 by the reprex package (v2.0.1)

Should we run all non-language straight labels through the letter-wise textshaping steps? We can still gain some efficiency by applying place_text() on the aggregate instead of individual letters.

from geomtextpath.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.