Comments (6)

MartinPacker commented on July 17, 2024

And what would be the strategy for doing this - in Python code? I ask, @scanny, because this logic is probably common to other removals.

from python-pptx.

scanny commented on July 17, 2024

Basically dig out the relationship and delete it.

The relationship(s) would be identified by an r:embed or r:link attribute with a value like "rId{N}", I believe. Dumping the XML for the moving shape would give you an idea.

Then you need to get to the slide part because that's the source side of the relationship, so something like:

slide_part = slide.part
slide_part.drop_rel("rIdN")  # "rIdN" stands in for the actual relationship id

Somebody can dig through and refine that with actual code if they have a mind to :)
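
As a rough illustration of "dumping the XML" to find the relationship ids, here is a minimal sketch that uses only the standard library rather than python-pptx. The function name `referenced_rids` and the trimmed-down `SAMPLE` movie shape are hypothetical, made up for this example; a real shape's XML will carry more elements and possibly different ids.

```python
import xml.etree.ElementTree as ET

# Namespace of the r:embed, r:link and r:id attributes in OOXML parts.
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"


def referenced_rids(shape_xml: str) -> list[str]:
    """Return the sorted, de-duplicated relationship ids referenced in shape_xml."""
    root = ET.fromstring(shape_xml.strip())
    prefix = "{" + R_NS + "}"
    rids = set()
    for element in root.iter():
        for name, value in element.attrib.items():
            # Any attribute in the relationships namespace holds an rId.
            if name.startswith(prefix):
                rids.add(value)
    return sorted(rids)


# Hypothetical, heavily trimmed movie shape; not taken from a real file.
SAMPLE = """
<p:pic xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main"
       xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main"
       xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
       xmlns:p14="http://schemas.microsoft.com/office/powerpoint/2010/main">
  <p:nvPicPr>
    <p:cNvPr id="4" name="movie"/>
    <p:nvPr>
      <a:videoFile r:link="rId2"/>
      <p:extLst><p:ext uri="hypothetical"><p14:media r:embed="rId1"/></p:ext></p:extLst>
    </p:nvPr>
  </p:nvPicPr>
  <p:blipFill><a:blip r:embed="rId5"/></p:blipFill>
</p:pic>
"""

print(referenced_rids(SAMPLE))  # ['rId1', 'rId2', 'rId5']
```

Each id found this way is only a candidate for removal. An image relationship such as the poster frame may be shared with other shapes on the slide, so it is worth checking before dropping it.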

scanny commented on July 17, 2024

Okay, so a couple possible approaches:

  1. Do the repair and save it to a separate file. Then compare the XML from the original to the repaired version to see how PowerPoint "fixes" the presentation.
  2. Extract the original PowerPoint file to a directory ($ unzip original.pptx). Then make the changes by hand, re-zip the presentation into a PPTX file, and keep trying things until it works.

The opc-diag tool was built for this kind of exploration:

  • You'll need to install from the develop branch on GitHub for it to work with Python 3: https://github.com/python-openxml/opc-diag/commits/develop/. Pretty sure it's something like: pip install -U git+https://github.com/python-openxml/opc-diag.git@develop
  • Documentation is here: https://opc-diag.readthedocs.io/en/latest/index.html
  • The diff, extract, and repackage subcommands are the most useful for this work. In particular, just unzipping a PPTX leaves the content of each XML file on a single line, which of course is hard to edit. opc-diag automatically reformats that nicely for you.

You might want to do a mix of these two approaches. The diff approach is good when you have no clue of what changes are required. The edit->repackage->try cycle is best when you have a pretty good idea what changes to try.
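
For readers who want to see what the diff approach amounts to, here is a minimal standard-library sketch. It is not opc-diag and does not reproduce its behavior. It only pretty-prints the XML parts of two packages and diffs them. The function names `pretty_parts` and `diff_packages` are hypothetical, as is the file name `repaired.pptx` in the usage comment.

```python
import difflib
import io
import zipfile
from xml.dom import minidom


def pretty_parts(package_bytes: bytes) -> dict[str, list[str]]:
    """Map each .xml/.rels part name in the package to its indented XML lines."""
    parts = {}
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        for name in package.namelist():
            if name.endswith((".xml", ".rels")):
                # Re-indent the XML, which is stored on a single line in the zip.
                pretty = minidom.parseString(package.read(name)).toprettyxml(indent="  ")
                parts[name] = pretty.splitlines()
    return parts


def diff_packages(original: bytes, repaired: bytes) -> list[str]:
    """Return unified-diff lines for every XML part that differs between packages."""
    before, after = pretty_parts(original), pretty_parts(repaired)
    lines = []
    for name in sorted(set(before) | set(after)):
        lines.extend(
            difflib.unified_diff(
                before.get(name, []),
                after.get(name, []),
                fromfile="original/" + name,
                tofile="repaired/" + name,
                lineterm="",
            )
        )
    return lines


# Usage (file names are placeholders):
# with open("original.pptx", "rb") as f1, open("repaired.pptx", "rb") as f2:
#     print("\n".join(diff_packages(f1.read(), f2.read())))
```

Parts that exist in only one package show up as entirely added or removed, which is one way a dropped relationship or media part would surface.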

MartinPacker commented on July 17, 2024

There's probably rather more to deleting a movie than removing a chunk of XML.

scanny commented on July 17, 2024

@shoang22 you're going to want to remove the relationship from the slide (package) part to the part containing the movie (Media part maybe?). Otherwise I expect PowerPoint isn't going to like seeing the orphaned movie. Not sure if that's the whole problem. Unfortunately, the repair error doesn't give us any idea of what it figures to be a "problem with content".

shoang22 commented on July 17, 2024

@scanny wrote: "Somebody can dig through and refine that with actual code if they have a mind to :)"

Something like this?

import pptx
from lxml import etree
from pptx.shapes.picture import Movie


def remove_movie(file_path: str) -> None:
    prs = pptx.Presentation(file_path)
    for slide in prs.slides:
        part = slide.part
        # Iterate over a copy so shapes can be removed inside the loop
        for shape in list(slide.shapes):
            if not isinstance(shape, Movie):
                continue
            # Dump the slide's relationships before the change
            before = etree.fromstring(part.rels.xml)
            print(etree.tostring(before, pretty_print=True).decode())
            # Remove the movie shape element from the slide XML
            vid = shape.element
            vid.getparent().remove(vid)
            # "rId2" is the video relationship in this particular file; other files will differ
            part.rels.pop("rId2")
            # Dump the relationships again to confirm the removal
            after = etree.fromstring(part.rels.xml)
            print(etree.tostring(after, pretty_print=True).decode())

    prs.save(file_path.rpartition(".")[0] + "_no_movies.pptx")

Prints:

<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.microsoft.com/office/2007/relationships/media" Target="../media/media1.mp4"/>
  <Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/video" Target="../media/media1.mp4"/>
  <Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/slideLayout" Target="../slideLayouts/slideLayout1.xml"/>
  <Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/notesSlide" Target="../notesSlides/notesSlide1.xml"/>
  <Relationship Id="rId5" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="../media/image1.png"/>
  <Relationship Id="rId6" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="../media/image2.jpeg"/>
</Relationships>

<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.microsoft.com/office/2007/relationships/media" Target="../media/media1.mp4"/>
  <Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/slideLayout" Target="../slideLayouts/slideLayout1.xml"/>
  <Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/notesSlide" Target="../notesSlides/notesSlide1.xml"/>
  <Relationship Id="rId5" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="../media/image1.png"/>
  <Relationship Id="rId6" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="../media/image2.jpeg"/>
</Relationships>

But I'm still getting the same error when I attempt to open the file.
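
One thing visible in the second dump is that rId1, the media relationship, still targets ../media/media1.mp4 after rId2 is removed. Whether that is what PowerPoint objects to is not established in this thread. As a minimal sketch for checking it, the standard-library helper below lists every relationship that points at a given target. The function name `rids_targeting` is hypothetical, and `RELS_AFTER` is a shortened copy of the second dump above.

```python
import xml.etree.ElementTree as ET

# Namespace of the <Relationships> element in a .rels part.
PKG_RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def rids_targeting(rels_xml: str, target: str) -> list[str]:
    """Return the ids of all relationships whose Target equals target."""
    root = ET.fromstring(rels_xml.strip())
    tag = "{" + PKG_RELS_NS + "}Relationship"
    return [rel.get("Id") for rel in root.iter(tag) if rel.get("Target") == target]


# Shortened copy of the "after" dump: rId2 is gone, rId1 remains.
RELS_AFTER = """
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.microsoft.com/office/2007/relationships/media" Target="../media/media1.mp4"/>
  <Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/slideLayout" Target="../slideLayouts/slideLayout1.xml"/>
  <Relationship Id="rId5" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="../media/image1.png"/>
</Relationships>
"""

print(rids_targeting(RELS_AFTER, "../media/media1.mp4"))  # ['rId1']
```

Any id this returns could be removed the same way rId2 was in the function above. That is an experiment to try, not a confirmed fix.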
