I have built a fat jar from my project, which uses Spark to read some CSVs and write them to Kafka. The project is written in Java; when I run the code in IntelliJ IDEA it works fine, but when I run the jar file it fails with org.apache.spark.sql.AnalysisException: Failed to find data source: kafka.
A. I used IntelliJ IDEA to build the jar file, but when running it the Main class wasn't loaded or found:
java -jar project_jar.jar nilian.Main
(I suspect part of the problem is that with -jar the JVM takes the main class from the jar's manifest, and nilian.Main on the command line is just treated as a program argument.) So I completely gave up on this one.
B. I used Maven to build the jar file, first with this Build configuration:
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>3.2.4</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <filters>
              <filter>
                <artifact>*:*</artifact>
                <excludes>
                  <exclude>META-INF/*.SF</exclude>
                  <exclude>META-INF/*.DSA</exclude>
                  <exclude>META-INF/*.RSA</exclude>
                </excludes>
              </filter>
              <filter>
                <artifact>org.apache.spark:spark-core_2.13</artifact>
                <excludes>
                  <exclude>sun/nio/ch/DirectBuffer</exclude>
                </excludes>
              </filter>
            </filters>
            <transformers>
              <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                <mainClass>nilian.Main</mainClass>
              </transformer>
            </transformers>
            <createDependencyReducedPom>false</createDependencyReducedPom>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
with this result while running the jar file:
Exception in thread "Thread-25" org.apache.spark.sql.AnalysisException: Failed to find data source: kafka. Please deploy the application as per the deployment section of Structured Streaming + Kafka Integration Guide.
at org.apache.spark.sql.errors.QueryCompilationErrors$.failedToFindKafkaDataSourceError(QueryCompilationErrors.scala:1568)
at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:645)
at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:697)
at org.apache.spark.sql.DataFrameWriter.lookupV2Provider(DataFrameWriter.scala:863)
at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:257)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:248)
at nilian.outputstuff.writers.KafkaWriter.write(KafkaWriter.java:21)
at nilian.outputstuff.MyOutPutResolver.resolveTrafficOutPut(MyOutPutResolver.java:53)
at nilian.threadstuff.runnables.TrafficRunnable.run(TrafficRunnable.java:171)
at java.base/java.lang.Thread.run(Thread.java:840)
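From what I've read, a common cause of "Failed to find data source: kafka" in shaded jars is that the META-INF/services registration files (which Spark's DataSource.lookupDataSource uses to discover providers) get overwritten instead of merged during shading. If that is what's happening here, I assume the fix would be to add a ServicesResourceTransformer next to the existing ManifestResourceTransformer, roughly like this — is that the right direction?

```xml
<!-- sketch: merge META-INF/services entries so the data-source
     registration from spark-sql-kafka survives shading -->
<transformers>
  <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
  <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
    <mainClass>nilian.Main</mainClass>
  </transformer>
</transformers>
```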
C. I then tried this build configuration, with the same result:
<build>
  <plugins>
    <plugin>
      <artifactId>maven-assembly-plugin</artifactId>
      <configuration>
        <archive>
          <manifest>
            <mainClass>nilian.Main</mainClass>
          </manifest>
        </archive>
        <descriptorRefs>
          <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
        <transformers>
          <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
            <mainClass>nilian.Main</mainClass>
          </transformer>
        </transformers>
      </configuration>
      <executions>
        <execution>
          <id>make-assembly</id>
          <phase>package</phase>
          <goals>
            <goal>single</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
D. I should also mention that I tried running the jar with the Kafka jars on the classpath, but that didn't work out either — I still got the error about not finding the kafka source (and I suspect -cp is ignored here anyway, since -jar takes the classpath from the jar's manifest):
java --add-exports=java.base/sun.nio.ch=ALL-UNNAMED -cp "project_jar.jar:KAFKA_JARs/*" -jar project_jar.jar
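For completeness: the error message refers to the deployment section of the Structured Streaming + Kafka Integration Guide, which recommends supplying the connector at submit time instead of baking it into the jar. I assume the equivalent for my project would be something like the following — the artifact version and Scala suffix are guesses based on my spark-core_2.13 dependency, so they may need adjusting:

```shell
# sketch: let spark-submit resolve the Kafka connector at launch;
# _2.13 / 3.4.1 are assumptions matching my build, not verified
spark-submit \
  --class nilian.Main \
  --packages org.apache.spark:spark-sql-kafka-0-10_2.13:3.4.1 \
  project_jar.jar
```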
Thanks for your help!